  1. What are the pros and cons of the Apache Parquet format …

    Apr 24, 2016 · Parquet files are most commonly compressed with the Snappy compression algorithm. Snappy compressed files are splittable and quick to inflate. Big data …
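
    For example, writing a Snappy-compressed Parquet file from pandas (a minimal sketch; Snappy is the pandas default, and the file name is a placeholder):

        import pandas as pd

        df = pd.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]})
        # compression="snappy" is the pandas default, shown explicitly here
        df.to_parquet("example.parquet", compression="snappy")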

  2. Inspect Parquet from command line - Stack Overflow

    How do I inspect the content of a Parquet file from the command line? The only option I see now is:

        $ hadoop fs -get my-path local-file
        $ parquet-tools head local-file | less

    I would like to avoid …
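
    Alternatively, the file can be inspected in place with pyarrow (a sketch, assuming pyarrow is installed; the path is a placeholder):

        import pyarrow.parquet as pq

        pf = pq.ParquetFile("my-file.parquet")
        print(pf.schema_arrow)    # column names and types
        print(pf.metadata)        # row groups, row counts, sizes
        print(pf.read_row_group(0).to_pandas().head())  # peek at the first rows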

  3. Is it possible to read parquet files in chunks? - Stack Overflow

    Nov 29, 2019 · The Parquet format stores the data in chunks, but there isn't a documented way to read it in chunks the way read_csv can. Is there a way to read parquet files in chunks?
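
    One workaround is pyarrow's iter_batches, which materializes one batch at a time (a sketch; the path and batch size are placeholders):

        import pyarrow.parquet as pq

        pf = pq.ParquetFile("big.parquet")
        # iter_batches yields RecordBatches of up to batch_size rows,
        # so only one chunk is held in memory at a time
        for batch in pf.iter_batches(batch_size=100_000):
            chunk = batch.to_pandas()
            print(len(chunk))  # stand-in for real per-chunk processing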

  4. Extension of Apache parquet files, is it '.pqt' or '.parquet'?

    Oct 19, 2021 · I wonder if there is a consensus regarding the extension of parquet files. I have seen a shorter .pqt extension, which has the typical three letters (like csv, tsv, txt, etc.) and then …

  5. How to read a Parquet file into Pandas DataFrame?

    How to read a modestly sized Parquet data-set into an in-memory Pandas DataFrame without setting up a cluster computing infrastructure such as Hadoop or Spark? This is only a …
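
    With a local engine such as pyarrow installed, no cluster is needed (a minimal sketch; the path is a placeholder):

        import pandas as pd

        # pandas delegates to pyarrow (or fastparquet) under the hood
        df = pd.read_parquet("data.parquet")
        print(df.head())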

  6. Microsoft Fabric-Delta parquet format? - Stack Overflow

    Oct 12, 2023 · Delta-Parquet uses the same compressed Parquet format for its underlying storage, but adds metadata for features like ACID transactions, versioning, and …
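
    A sketch of reading such a table with the deltalake Python package (assuming the package is available; the path is a placeholder):

        from deltalake import DeltaTable

        dt = DeltaTable("path/to/delta_table")
        print(dt.version())  # versioning comes from the transaction log
        df = dt.to_pandas()  # the underlying storage is plain Parquet files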

  7. What file extension is the correct way to name parquet files?

    Jun 3, 2025 · <file-name>.parquet: 1) This is the standard and most widely accepted naming convention. 2) The compression codec is stored in the Parquet file metadata, not in the …

  8. Reading / Fixing a corrupt parquet file - Stack Overflow

    Sep 3, 2024 · Reading / Fixing a corrupt parquet file
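
    One salvage approach worth trying (an assumption, not taken from the thread; it only works if the file footer is still readable) is to recover intact row groups one at a time with pyarrow:

        import pyarrow.parquet as pq

        pf = pq.ParquetFile("corrupt.parquet")  # placeholder path
        good = []
        for i in range(pf.metadata.num_row_groups):
            try:
                good.append(pf.read_row_group(i))  # keep readable row groups
            except Exception as exc:
                print(f"row group {i} unreadable: {exc}")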

  9. Spark parquet partitioning : Large number of files

    Jun 28, 2017 · I am trying to leverage Spark partitioning. I was trying to do something like data.write.partitionBy("key").parquet("/location"). The issue here is that each partition creates huge …
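
    A commonly suggested fix is to repartition by the same column before writing, so each partition value is written by a single task (a PySpark sketch; the input path is a placeholder):

        from pyspark.sql import SparkSession

        spark = SparkSession.builder.getOrCreate()
        data = spark.read.parquet("/input")  # placeholder source

        # one shuffle partition per "key" value -> far fewer output files
        (data.repartition("key")
             .write.partitionBy("key")
             .parquet("/location"))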

  10. Create Parquet files from stream in python in memory-efficient …

    Nov 11, 2020 · It appears the most common way in Python to create Parquet files is to first create a Pandas dataframe and then use pyarrow to write the table to parquet. I worry that this …
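
    A more memory-efficient alternative is pyarrow's ParquetWriter, which flushes one row group per write call (a sketch; the schema, batch size, and generated batches are stand-ins for a real stream):

        import pyarrow as pa
        import pyarrow.parquet as pq

        schema = pa.schema([("id", pa.int64()), ("value", pa.string())])

        # only one batch is held in memory at a time
        with pq.ParquetWriter("stream.parquet", schema) as writer:
            for start in range(0, 1_000_000, 100_000):  # stand-in stream
                batch = pa.table(
                    {"id": list(range(start, start + 100_000)),
                     "value": ["x"] * 100_000},
                    schema=schema,
                )
                writer.write_table(batch)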