One point in a recent note on doing data science in Python differs from our experience:
“A development environment [for Python] specifically tailored to the data science sector on the level of RStudio, for example, does not (yet) exist.” (“What’s the Best Statistical Software? A Comparison of R, Python, SAS, SPSS and STATA”, Amit Ghosh)
Actually, Python has a large number of very capable integrated development environments, some of which are specifically tailored for data science. Please read on for a small list of tools, and my recommendation for a specific “data science in Python” toolchain.
Off the top of my head I remember the following Python tools:
- PyCharm, in both Community Edition (“The Python IDE for Professional Developers”) and Professional Edition (“For both Scientific and Web Python development. With HTML, JS, and SQL support”). This IDE has amazing refactoring and completion abilities, and automatically critiques your code against the PEP 8 code style recommendations.
- Black “The uncompromising code formatter”.
- JupyterLab “a web-based interactive development environment for Jupyter notebooks, code, and data” (the successor to Jupyter Notebook and IPython Notebook).
- The Anaconda distribution, a great package set and package manager.
- Spyder “a powerful scientific environment written in Python”.
- Apache Zeppelin “Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more”.
- VS Code for Python, Python support in Microsoft’s Visual Studio Code editor.
- PyDev, an Eclipse-based Python IDE.
- elpy “Emacs Python Development Environment”.
- Dash “a framework for building analytical web applications”.
My current “data science in Python” go-to tools are: PyCharm, JupyterLab, Black, and Anaconda. PyCharm is one of the best IDEs I have seen; JupyterLab notebooks are good for capturing reproducible research and mixing documentation and code; Black keeps your code consistently formatted, which makes it easier to read and review; and Anaconda makes environment management easy.
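To make concrete what a code formatter changes, here is a minimal sketch using only the Python standard library. It does not call Black itself. The function `mean`, the helper `same_program`, and both source strings are hypothetical illustrations, and the tidy version is laid out by hand in a PEP 8 friendly style rather than being actual Black output. The point is that reformatting alters layout, not meaning: both versions parse to the same syntax tree.

```python
import ast

# Hypothetical example: a small function with inconsistent spacing.
messy = "def mean(xs) :\n    return sum( xs )/len(xs)\n"

# The same function laid out by hand in a PEP 8 friendly style
# (an assumption for illustration, not real formatter output).
tidy = "def mean(xs):\n    return sum(xs) / len(xs)\n"

# A version whose behavior really differs (floor division).
changed = "def mean(xs):\n    return sum(xs) // len(xs)\n"


def same_program(a: str, b: str) -> bool:
    """Return True when two source strings parse to the same syntax tree."""
    # ast.dump omits line and column positions by default,
    # so differences in whitespace do not affect the comparison.
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


print(same_program(messy, tidy))    # layout differs, program is the same
print(same_program(tidy, changed))  # the operator differs, so the trees differ
```

A check of this kind is why handing layout decisions to a tool is low risk: the formatted file can be compared against the original at the syntax-tree level.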
Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.