Pachyderm
Stand No: 195

What would data analytics infrastructure (namely Hadoop) look like if we rebuilt it from scratch today? We think it would be containerized, modular, and easy enough for a single person to use while still being scalable enough for a whole company. Tools like Docker and Kubernetes provide the perfect building blocks for us to revolutionize data infrastructure!

Pachyderm is “Git for Data Science.” We offer complete version control for data and give your data science team the same first-class development tools that software developers have. Pachyderm is ideal for building machine learning pipelines and ETL workflows because we trace every model and output directly back to the raw input datasets that created it (a.k.a. provenance).

Since everything in Pachyderm is a container, data scientists can use any languages or libraries they want (e.g., Spark, R, Python, OpenCV, etc.) without any additional infrastructure overhead.

Speaking at the event...

Nick Harvey

Lead Developer Advocate

Pachyderm

12.20pm - Day 1


  • Real World AI Auditing, Tracking and Transparency
  • Pachyderm

