Discover reviews on "real datascience project step by step" based on Reddit discussions and experiences.
Last updated: October 5, 2024 at 06:20 AM
Real Data Science Project Step by Step
Python Code Optimization and Alternatives
- PyPy: A suggestion to use PyPy for faster code execution, quoting Guido van Rossum.
- Optimizing Python Code: Recommendations to profile before optimizing, read files in larger chunks, consider mmap, reduce excessive C FFI calls, and try an alternative memory allocator such as tcmalloc.
- Cython and Numba: Mentioned for speeding up Python code, especially for algorithms where vectorization is not possible.
- Julia: Suggested as an alternative for data science projects.
- Rust Implementation: Comments on rewriting code in Rust for better performance compared to Python.
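As an illustration of the profiling, chunked-read, and mmap suggestions above, here is a minimal sketch using only the Python standard library. The helper names (make_sample_file, count_newlines_chunked, count_newlines_mmap, profile_call) and the 1 MiB chunk size are assumptions chosen for this example; they do not come from the discussions.

```python
import cProfile
import io
import mmap
import os
import pstats
import tempfile


def make_sample_file(n_lines):
    """Write a small temporary text file and return its path (illustrative helper)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "wb") as f:
        for i in range(n_lines):
            f.write(b"row %d\n" % i)
    return path


def count_newlines_chunked(path, chunk_size=1 << 20):
    """Count newline bytes by reading the file in large binary chunks."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += chunk.count(b"\n")
    return total


def count_newlines_mmap(path, chunk_size=1 << 20):
    """Count newline bytes through a read-only memory map.

    Note: mmap cannot map an empty file, so this assumes a non-empty input.
    """
    total = 0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for start in range(0, len(mm), chunk_size):
                total += mm[start:start + chunk_size].count(b"\n")
    return total


def profile_call(func, *args):
    """Run func under cProfile; return its result and a short text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(5)  # top 5 entries only
    return result, out.getvalue()


if __name__ == "__main__":
    sample = make_sample_file(10_000)
    try:
        count, report = profile_call(count_newlines_chunked, sample)
        print(count, count_newlines_mmap(sample))
        print(report)
    finally:
        os.remove(sample)
```

Whether chunked reads or mmap is faster depends on the file size, the operating system, and the access pattern, so the profile report is the part worth checking before committing to either.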
Rust Learning Resources
- Rust Learning: Recommendations included the official Rust book (The Rust Programming Language), Exercism, the rustlings exercises, and Rust in Action.
- Rust for Python Extensions: Positive feedback on using Rust to write Python extensions.
- PyO3: Success stories shared regarding rapid prototyping in Python and transitioning to Rust using PyO3 for performance bottlenecks.
Code Integration with PyO3
- PyO3 Integration: Users wanted a low-overhead way to integrate PyO3 and expressed frustration at having to create a new Rust module each time.
- Workflow: Users discussed ideal workflows for making small Rust additions to Python programs using PyO3.
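One way the prototype-in-Python, move-bottlenecks-to-Rust workflow is sometimes arranged on the Python side is to keep a pure-Python reference implementation and swap in a compiled extension when it is installed. The sketch below assumes a hypothetical extension module named _myproject_native that exposes a moving_average function; neither name comes from the discussions or from PyO3 itself, and the Rust side is not shown.

```python
import importlib


def moving_average_py(values, window):
    """Pure-Python reference implementation (the prototype)."""
    if window <= 0:
        raise ValueError("window must be positive")
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        if i >= window - 1:
            out.append(running / window)
    return out


def load_backend(module_name="_myproject_native"):
    """Return (function, label), falling back to Python if the extension is absent.

    module_name is a hypothetical compiled module, e.g. one built with PyO3.
    """
    try:
        native = importlib.import_module(module_name)
        return native.moving_average, "native"
    except (ImportError, AttributeError):
        return moving_average_py, "python"


# Callers use moving_average and do not need to know which backend is active.
moving_average, BACKEND = load_backend()

if __name__ == "__main__":
    print(BACKEND, moving_average([1, 2, 3, 4], 2))
```

Keeping the Python version around also gives a reference to test the native function against, which may ease the friction of adding small Rust pieces one at a time.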
General Thoughts
- Native Code Performance: Recognition that native code outperforms scripting languages like Python.
- Testimonial: Rust was praised for heavy custom processing tasks and for its potential to change perceptions of what it can do.
The rest of the comments did not directly relate to the query about real data science projects step by step.