Sample Parquet File Download — Free Apache Parquet for Testing
Download free Apache Parquet example files from 100 KB to 50 MB in Snappy, GZIP, and uncompressed variants. These test files are built for data engineers and analysts working with Spark, Pandas, BigQuery, Athena, DuckDB, and Snowflake. Use them to test data lake ingestion, ETL pipelines, and columnar query performance.
sample-100kb.parquet
1,100 rows · SNAPPY
Verified file details
- Filename: sample-100kb.parquet
- Exact size: 103,210 bytes
- Displayed size: 101 KB
- MIME type: application/octet-stream
- Rows: 1,100
- Columns: 6
- Codec: SNAPPY
- Note: simple-flat
- License: CC0 / Public Domain
- Download URL: https://truefilesize.com/files/parquet/sample-100kb.parquet
See how TrueFileSize generates and measures sample files, or review the editorial policy.
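The "Displayed size" values on this page are consistent with binary units (1 KB = 1,024 bytes), whole numbers for KB, and two decimals for MB. The sketch below reproduces that arithmetic; the function name `format_size` and the rounding rule are assumptions inferred from the listed values, not a description of how TrueFileSize computes them.

```python
def format_size(num_bytes: int) -> str:
    """Format a byte count using binary units.

    Assumption: 1 KB = 1024 bytes, whole numbers for KB, two decimals
    for MB. This rule is inferred from the values listed on the page.
    """
    kb = 1024
    mb = 1024 * 1024
    if num_bytes >= mb:
        return f"{num_bytes / mb:.2f} MB"
    return f"{num_bytes / kb:.0f} KB"


print(format_size(103_210))     # 101 KB   (sample-100kb.parquet)
print(format_size(1_105_453))   # 1.05 MB  (sample-1mb.parquet)
print(format_size(55_536_645))  # 52.96 MB (sample-50mb.parquet)
```

Comparing the exact byte count of a download against the listed value is a stricter check than comparing the rounded display size.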
sample-500kb.parquet
3,300 rows · SNAPPY
Verified file details
- Filename: sample-500kb.parquet
- Exact size: 520,983 bytes
- Displayed size: 509 KB
- MIME type: application/octet-stream
- Rows: 3,300
- Columns: 10
- Codec: SNAPPY
- Note: nested-schema
- License: CC0 / Public Domain
- Download URL: https://truefilesize.com/files/parquet/sample-500kb.parquet
sample-1mb.parquet
5,000 rows · GZIP
Verified file details
- Filename: sample-1mb.parquet
- Exact size: 1,105,453 bytes
- Displayed size: 1.05 MB
- MIME type: application/octet-stream
- Rows: 5,000
- Columns: 12
- Codec: GZIP
- Note: with-nulls
- License: CC0 / Public Domain
- Download URL: https://truefilesize.com/files/parquet/sample-1mb.parquet
sample-5mb.parquet
22,000 rows · SNAPPY
Verified file details
- Filename: sample-5mb.parquet
- Exact size: 5,393,543 bytes
- Displayed size: 5.14 MB
- MIME type: application/octet-stream
- Rows: 22,000
- Columns: 15
- Codec: SNAPPY
- Note: large-columns
- License: CC0 / Public Domain
- Download URL: https://truefilesize.com/files/parquet/sample-5mb.parquet
sample-10mb.parquet
44,000 rows · SNAPPY
Verified file details
- Filename: sample-10mb.parquet
- Exact size: 10,780,760 bytes
- Displayed size: 10.28 MB
- MIME type: application/octet-stream
- Rows: 44,000
- Columns: 15
- Codec: SNAPPY
- Note: production-like
- License: CC0 / Public Domain
- Download URL: https://truefilesize.com/files/parquet/sample-10mb.parquet
sample-50mb.parquet
150,000 rows · SNAPPY
Verified file details
- Filename: sample-50mb.parquet
- Exact size: 55,536,645 bytes
- Displayed size: 52.96 MB
- MIME type: application/octet-stream
- Rows: 150,000
- Columns: 20
- Codec: SNAPPY
- Note: stress-test
- License: CC0 / Public Domain
- Download URL: https://truefilesize.com/files/parquet/sample-50mb.parquet
sample-uncompressed.parquet
2,000 rows · NONE
Verified file details
- Filename: sample-uncompressed.parquet
- Exact size: 203,918 bytes
- Displayed size: 199 KB
- MIME type: application/octet-stream
- Rows: 2,000
- Columns: 8
- Codec: NONE
- Note: uncompressed
- License: CC0 / Public Domain
- Download URL: https://truefilesize.com/files/parquet/sample-uncompressed.parquet
sample-gzip.parquet
3,000 rows · GZIP
Verified file details
- Filename: sample-gzip.parquet
- Exact size: 305,642 bytes
- Displayed size: 298 KB
- MIME type: application/octet-stream
- Rows: 3,000
- Columns: 8
- Codec: GZIP
- Note: gzip-compressed
- License: CC0 / Public Domain
- Download URL: https://truefilesize.com/files/parquet/sample-gzip.parquet
Use cases for sample Parquet files
- Testing Parquet readers (pyarrow, DuckDB, Spark, pandas)
- Benchmarking Parquet vs CSV read performance
- Testing data lake ingestion pipelines (S3, GCS, ADLS)
- Verifying Parquet schema evolution and compatibility
- Testing BI tool Parquet import (Tableau, Power BI, Metabase)
- Validating Snappy vs GZIP compression handling
Parquet vs CSV vs JSON for analytics
| Feature | Parquet | CSV | JSON |
|---|---|---|---|
| Storage layout | Columnar | Row-based | Row-based |
| Typical file size (1M rows, varies by schema and data) | ~50 MB | ~200 MB | ~400 MB |
| Column pruning | Yes (read only needed cols) | No (read all) | No (read all) |
| Schema enforcement | Yes (typed columns) | No (all strings) | Partial |
| Predicate pushdown | Yes (row group stats) | No | No |
| Human readable | No (binary) | Yes | Yes |
| Best for | Analytics, data lakes, ML | Data exchange, imports | APIs, configs |
How to read and write Parquet files
# Python (pandas + pyarrow, the most common combination)
import pandas as pd
df = pd.read_parquet('data.parquet')
df.to_parquet('output.parquet', engine='pyarrow')
# Python (polars, a faster alternative)
import polars as pl
df = pl.read_parquet('data.parquet')
# DuckDB (SQL directly on Parquet files)
import duckdb
duckdb.sql("SELECT * FROM 'data.parquet' WHERE age > 30")
duckdb.sql("COPY (SELECT * FROM my_table) TO 'out.parquet' (FORMAT PARQUET)")
# Apache Spark (assumes an existing SparkSession named spark)
df = spark.read.parquet("s3://bucket/data.parquet")
# CLI inspection (parquet-tools / pqrs)
parquet-tools schema data.parquet
parquet-tools head data.parquet
pqrs schema data.parquet
Parquet compression codecs
| Codec | Ratio | Speed | When to use |
|---|---|---|---|
| Snappy | Good | Very fast | Common default and a good balance (Spark, DuckDB) |
| GZIP | High | Slow | Long-term storage, bandwidth-limited |
| ZSTD | High | Fast | Modern alternative to GZIP (Spark 3+) |
| None | 1:1 | Fastest | Testing, already-compressed data |
Technical specifications
| Property | Value |
|---|---|
| Full name | Apache Parquet |
| Extension | .parquet |
| Type | Columnar binary storage format |
| Magic bytes | PAR1 (header and footer) |
| Compression | Snappy (default in many writers), GZIP, ZSTD, LZ4, Brotli, None |
| Encoding | Dictionary, RLE, Delta, Bit-packing |
| Nested types | Dremel-style repetition/definition levels |
| Developed by | Twitter + Cloudera (2013), Apache project |
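The magic bytes row can be checked with the standard library alone. The sketch below assumes the usual unencrypted layout: a 4-byte `PAR1` marker, the data, the footer metadata, a 4-byte little-endian footer length, and a closing `PAR1`. The function name `check_parquet_framing` is hypothetical, and passing this check does not prove the file is otherwise valid.

```python
import os
import struct


def check_parquet_framing(path: str) -> int:
    """Check the PAR1 markers and return the footer metadata length in bytes."""
    # 4 (magic) + 4 (footer length) + 4 (magic) is the smallest possible frame.
    if os.path.getsize(path) < 12:
        raise ValueError("file too small to be Parquet")
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-8, os.SEEK_END)
        tail = f.read(8)
    if head != b"PAR1" or tail[4:] != b"PAR1":
        raise ValueError("missing PAR1 magic bytes")
    # The 4 bytes before the trailing magic hold the footer length.
    (footer_len,) = struct.unpack("<I", tail[:4])
    return footer_len
```

This is a cheap first test for truncated downloads, since a file cut off mid-transfer loses its trailing marker.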
Frequently Asked Questions
What is Apache Parquet?
Parquet vs CSV — Which is better for data?
Parquet vs Avro — What is the difference?
How do I read a Parquet file?
Parquet compression types — Snappy vs GZIP vs ZSTD?
How do I convert CSV to Parquet?