Democratizing Data Access via NVIDIA NGC

October 23, 2022

Lisa Marie Els

As a key step in democratizing access to data, DefinedCrowd will provide dataset samples through the NVIDIA NGC catalog, a GPU-optimized hub for AI and HPC containers, pre-trained models, and SDKs that simplifies and accelerates end-to-end workflows.

Introducing DefinedCrowd, a One-Stop-Shop for AI Training Data

Our core business is providing high-quality AI training data to companies building world-class AI solutions. Our customers can access this data through DefinedData, an online marketplace of off-the-shelf AI training data available in multiple languages, domains, and recording types.

The dataset consists of scripted speech collected through the DefinedCrowd Neevo platform from several speakers in the UK (members of the DefinedCrowd crowd).

Step 4: Data Preparation

After downloading the speech data from the DefinedCrowd API, we need to adapt it to the format expected by NeMo for ASR training.
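As a rough illustration of this step: NeMo ASR training reads a manifest file with one JSON object per line, each carrying the keys audio_filepath, duration (in seconds), and text. The sketch below builds such a manifest from pairs of WAV paths and transcripts. The layout of the DefinedCrowd download is not described here, so the input pairs, the helper names (wav_duration_seconds, normalize_transcript, build_manifest), and the lowercase normalization are assumptions made for illustration, not the article's actual pipeline.

```python
import json
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / float(wav.getframerate())


def normalize_transcript(text):
    """Lowercase and collapse whitespace (assumed normalization;
    adapt it to the vocabulary of the model being trained)."""
    return " ".join(text.lower().split())


def build_manifest(utterances, manifest_path):
    """Write a NeMo-style manifest: one JSON object per line.

    `utterances` is an iterable of (wav_path, transcript) pairs.
    This pairing is hypothetical; map the downloaded files to it.
    Returns the number of entries written.
    """
    count = 0
    with open(manifest_path, "w", encoding="utf-8") as out:
        for audio_path, transcript in utterances:
            entry = {
                "audio_filepath": str(Path(audio_path).resolve()),
                "duration": round(wav_duration_seconds(audio_path), 3),
                "text": normalize_transcript(transcript),
            }
            out.write(json.dumps(entry, ensure_ascii=False) + "\n")
            count += 1
    return count
```

Calling build_manifest with the list of downloaded utterances and an output path such as train_manifest.json (a placeholder name) yields a file that can be referenced from the training configuration. Computing duration from the audio itself, rather than trusting metadata, keeps the manifest consistent with the files on disk.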

Source: Defined AI