Ari Morcos
Matthew Leavitt
Bogdan Gaza
AI models are a reflection of the data on which they are trained – models are what they eat. The specifics of the data on which a model is trained generally have a far greater impact on that model’s performance than the specifics of the model’s architecture. Training models on the right data can drive dramatic improvements in model performance and – just as importantly – in model efficiency.
Data curation for self-supervised learning remains a challenging, cutting-edge research problem, one that only the world’s top AI research labs are able to tackle effectively. Until now. DatologyAI was founded to democratize this critical part of the AI infrastructure stack, making model training more accessible to all through better data.