Installieren Sie die genialokal App auf Ihrem Startbildschirm für einen schnellen Zugriff und eine komfortable Nutzung.
Tippen Sie einfach auf Teilen:
Und dann auf "Zum Home-Bildschirm [+]".
Bei genialokal.de kaufen Sie online bei Ihrer lokalen, inhabergeführten Buchhandlung!
"Practical Parquet Engineering"
"Practical Parquet Engineering" is an authoritative and comprehensive guide to mastering the design, implementation, and optimization of Apache Parquet, the industry-standard columnar storage format for big data analytics. Beginning with the architectural fundamentals, the book elucidates Parquet's design philosophy and core principles, providing a nuanced understanding of its logical and physical models. Readers will benefit from in-depth comparisons to alternative formats like ORC and Avro, along with explorations of schema evolution, metadata management, and the unique benefits of self-describing storage-making this an essential reference for anyone seeking to build resilient and efficient data infrastructure.
Moving from theory to hands-on application, the book offers actionable best practices for both writing and querying Parquet at scale. Topics such as file construction, encoding strategies, compression, and partitioning are addressed with precision, alongside nuanced guidance for language-specific implementations and optimizing data pipelines in distributed and cloud environments. Advanced chapters cover real-world performance tuning, including benchmarking, profiling, cache strategies, and troubleshooting complex bottlenecks in production. Readers will also learn how to leverage Parquet's rich metadata and statistics for query acceleration, and how to integrate seamlessly with modern analytics frameworks like Spark, Presto, and Hive.
Addressing emerging requirements around security, compliance, and data quality, "Practical Parquet Engineering" goes beyond functionality to cover data governance, encryption, access control, and regulatory mandates like GDPR and HIPAA. Dedicated chapters on validation, testing, and quality management socialize industry-strength patterns for ensuring correctness and resilience. The book culminates in advanced topics, custom engineering extensions, and a diverse suite of case studies from enterprise data lakes, global analytics, IoT, and hybrid-cloud architectures, making it an indispensable resource for data engineers, architects, and technical leaders aiming to future-proof their data platforms with Parquet.