R2 Data Catalog: Managed Apache Iceberg tables with zero egress fees
2025-04-10
R2 Data Catalog is now in public beta: a managed Apache Iceberg data catalog built directly into your R2 bucket.
Internally, an Iceberg table is a collection of data files (typically stored in columnar formats like Parquet or ORC) and metadata files (typically stored in JSON or Avro) that describe table snapshots, schemas, and partition layouts.
To understand how query engines interact efficiently with Iceberg tables, it helps to look at an Iceberg metadata file (simplified):
{
  "format-version": 2,
  "table-uuid": "0195e49b-8f7c-7933-8b43-d2902c72720a",
  "location": "s3://my-bucket/warehouse/0195e49b-79ca/table",
  "current-schema-id": 0,
  "schemas": [
    {
      "schema-id": 0,
      "type": "struct",
      "fields": [
        { "id": 1, "name": "id", "required": false, "type": "long" },
        { "id": 2, "name": "data", "required": false, "type": "string" }
      ]
    }
  ],
  "current-snapshot-id": 3567362634015106507,
  "snapshots": [
    {
      "snapshot-id": 3567362634015106507,
      "sequence-number": 1,
      "timestamp-ms": 1743297158403,
      "manifest-list": "s3://my-bucket/warehouse/0195e49b-79ca/table/metadata/snap-3567362634015106507-0.avro",
      "summary": {},
      "schema-id": 0
    }
  ],
  "partition-specs": [{ "spec-id": 0, "fields": [] }]
}
A few of the important components are:
schemas: Iceberg tracks schema changes over time. Engines use schema information to safely read and write data without needing to rewrite underlying files.
snapshots: Each snapshot references a specific set of data files that represent the state of the table at a point in time. This enables features like time travel.
partition-specs: These define how the table is logically partitioned. Query engines use this information during planning to skip unnecessary partitions, greatly improving query performance.
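To make these components concrete, here is a minimal sketch that loads a trimmed copy of the metadata file above as a plain Python dictionary and resolves the current schema and snapshot. The helper names (`current_schema`, `current_snapshot`, `snapshot_as_of`) are hypothetical and written only for illustration; real query engines rely on an Iceberg library rather than hand-parsing the JSON.

```python
import json

# Trimmed copy of the simplified metadata file shown above.
METADATA_JSON = """
{
  "format-version": 2,
  "current-schema-id": 0,
  "schemas": [
    {
      "schema-id": 0,
      "type": "struct",
      "fields": [
        {"id": 1, "name": "id", "required": false, "type": "long"},
        {"id": 2, "name": "data", "required": false, "type": "string"}
      ]
    }
  ],
  "current-snapshot-id": 3567362634015106507,
  "snapshots": [
    {
      "snapshot-id": 3567362634015106507,
      "sequence-number": 1,
      "timestamp-ms": 1743297158403,
      "manifest-list": "s3://my-bucket/warehouse/0195e49b-79ca/table/metadata/snap-3567362634015106507-0.avro",
      "schema-id": 0
    }
  ]
}
"""


def current_schema(metadata):
    """Return the schema whose id matches current-schema-id."""
    wanted = metadata["current-schema-id"]
    return next(s for s in metadata["schemas"] if s["schema-id"] == wanted)


def current_snapshot(metadata):
    """Return the snapshot matching current-snapshot-id, or None."""
    wanted = metadata.get("current-snapshot-id")
    return next(
        (s for s in metadata["snapshots"] if s["snapshot-id"] == wanted), None
    )


def snapshot_as_of(metadata, timestamp_ms):
    """Time travel: latest snapshot committed at or before timestamp_ms."""
    candidates = [
        s for s in metadata["snapshots"] if s["timestamp-ms"] <= timestamp_ms
    ]
    return max(candidates, key=lambda s: s["timestamp-ms"], default=None)


metadata = json.loads(METADATA_JSON)
column_names = [f["name"] for f in current_schema(metadata)["fields"]]
snapshot = current_snapshot(metadata)
print(column_names)
print(snapshot["manifest-list"])
```

The snapshot's `manifest-list` is the entry point an engine would follow next to discover which data files belong to that version of the table.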
By reading Iceberg metadata, query engines can efficiently prune partitions, load only the relevant snapshots, and fetch only the data files they need, resulting in faster queries.
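The pruning idea can be sketched with a toy example. Assume a hypothetical table partitioned by a `day` value, where each data file records the single partition value it belongs to. The `DataFile` class and `prune` function are illustrative names, not part of Iceberg or any query engine; in real tables this information lives in manifest files, which the sketch replaces with an in-memory list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    path: str
    day: str  # partition value as an ISO date (hypothetical partition column)
    record_count: int


def prune(files, start_day, end_day):
    """Keep only files whose partition value falls in [start_day, end_day]."""
    # ISO dates sort lexicographically, so string comparison is sufficient.
    return [f for f in files if start_day <= f.day <= end_day]


files = [
    DataFile("data/day=2025-04-08/a.parquet", "2025-04-08", 1000),
    DataFile("data/day=2025-04-09/b.parquet", "2025-04-09", 1200),
    DataFile("data/day=2025-04-10/c.parquet", "2025-04-10", 900),
]

# A query filtered to two days never has to open the file for 2025-04-08.
to_read = prune(files, "2025-04-09", "2025-04-10")
print([f.path for f in to_read])
```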
Although the Iceberg data and metadata files themselves live directly in object storage (like R2), the list of tables and pointers to the current metadata need to be tracked centrally by a data catalog.
Think of a data catalog as a library's index system. While books (your data) are physically distributed across shelves (object storage), the index provides a single source of truth about what books exist, their locations, and their latest editions. Without this index, readers (query engines) would waste time searching for books, might access outdated versions, or could accidentally shelve new books in ways that make them unfindable.
Similarly, data catalogs ensure consistent, coordinated access, allowing multiple query engines to safely read from and write to the same tables without conflicts or data corruption.
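One way to picture that coordination is an atomic compare-and-swap on each table's metadata pointer: a writer's commit succeeds only if the table still points at the metadata file the writer started from. The `ToyCatalog` class below is a hypothetical in-memory sketch of that idea, not R2 Data Catalog's implementation or the Iceberg REST API.

```python
import threading


class CommitConflict(Exception):
    """Raised when another writer committed first."""


class ToyCatalog:
    def __init__(self):
        self._pointers = {}  # table name -> current metadata file location
        self._lock = threading.Lock()

    def create_table(self, name, metadata_location):
        with self._lock:
            if name in self._pointers:
                raise ValueError(f"table {name!r} already exists")
            self._pointers[name] = metadata_location

    def load_table(self, name):
        with self._lock:
            return self._pointers[name]

    def commit(self, name, expected, new):
        """Swap the pointer to `new` only if it still equals `expected`."""
        with self._lock:
            if self._pointers[name] != expected:
                raise CommitConflict(f"{name} moved past {expected}")
            self._pointers[name] = new


catalog = ToyCatalog()
catalog.create_table("default.my_table", "metadata/v1.json")

# Two writers both start from the same base metadata file.
base = catalog.load_table("default.my_table")
catalog.commit("default.my_table", expected=base, new="metadata/v2.json")

try:
    # The second writer's base is now stale, so its commit is rejected.
    catalog.commit("default.my_table", expected=base, new="metadata/v2-other.json")
    conflict = False
except CommitConflict:
    conflict = True

print(catalog.load_table("default.my_table"), conflict)
```

In this sketch the rejected writer would reload the table, reapply its changes on top of the newer metadata, and retry the commit.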
Ready to try it out? Here’s a quick example using PyIceberg and Python to get you started. For a detailed step-by-step guide, check out our developer docs.
1. Enable R2 Data Catalog on your bucket:
    npx wrangler r2 bucket catalog enable my-bucket
Or use the Cloudflare dashboard: Navigate to R2 Object Storage > Settings > R2 Data Catalog and click Enable.
2. Create a Cloudflare API token with permissions for both R2 storage and the data catalog.
3. Install PyIceberg and PyArrow, then open a Python shell or notebook:
    pip install pyiceberg pyarrow
4. Connect to the catalog and create a table:
import pyarrow as pa
from pyiceberg.catalog.rest import RestCatalog

# Define catalog connection details (replace variables)
WAREHOUSE = "<WAREHOUSE>"
TOKEN = "<TOKEN>"
CATALOG_URI = "<CATALOG_URI>"

# Connect to R2 Data Catalog
catalog = RestCatalog(
    name="my_catalog",
    warehouse=WAREHOUSE,
    uri=CATALOG_URI,
    token=TOKEN,
)

# Create default namespace
catalog.create_namespace("default")

# Create simple PyArrow table
df = pa.table({
    "id": [1, 2, 3],
    "name": ["Alice", "Bob", "Charlie"],
})

# Create an Iceberg table
table = catalog.create_table(
    ("default", "my_table"),
    schema=df.schema,
)
You can now append more data or run queries, just as you would with any Apache Iceberg table.
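As a sketch of that next step, the helper below appends a PyArrow table and reads the rows back. `append` and `scan().to_arrow()` are PyIceberg table methods; the `append_and_read` wrapper is a hypothetical name, and the code assumes `table` and `df` are the objects created in step 4.

```python
def append_and_read(table, rows):
    """Append rows (a PyArrow table) to an Iceberg table and read it back.

    `table` is expected to be a PyIceberg table, such as the one returned by
    catalog.create_table(...) in step 4.
    """
    table.append(rows)  # commits a new snapshot containing the rows
    return table.scan().to_arrow()  # full-table scan as a PyArrow table


# Usage, continuing from step 4 (requires a live catalog connection):
# result = append_and_read(table, df)
# print(result.num_rows)
```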
While R2 Data Catalog is in public beta, there will be no additional charges beyond standard R2 storage and operations costs incurred by query engines accessing data. Storage pricing for buckets with R2 Data Catalog enabled remains the same as standard R2 buckets: $0.015 per GB-month. As always, egress directly from R2 buckets remains $0.
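As a quick worked example of the storage rate quoted above, the sketch below estimates a monthly storage bill. The `monthly_storage_cost` function and the 1,000 GB figure are illustrative assumptions, and operations charges are not modeled.

```python
from decimal import Decimal

# USD per GB-month, the storage rate quoted above.
STORAGE_RATE_PER_GB_MONTH = Decimal("0.015")


def monthly_storage_cost(stored_gb):
    """Storage-only estimate; Decimal avoids binary floating-point rounding."""
    return Decimal(stored_gb) * STORAGE_RATE_PER_GB_MONTH


cost = monthly_storage_cost(1000)  # hypothetical 1,000 GB of table data
print(f"${cost:.2f} per month")
```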
In the future, we plan to introduce pricing for catalog operations (e.g., creating tables, retrieving table metadata, etc.) and data compaction.
Below is our current thinking on future pricing. We’ll communicate more details around timing well before billing begins, so you can confidently plan your workloads.
We’re excited to see how you use R2 Data Catalog! If you’ve never worked with Iceberg – or even analytics data – before, we think this is the easiest way to get started.
Next on our roadmap is tackling compaction and table optimization. Query engines typically perform better when dealing with fewer, but larger data files. We will automatically re-write collections of small data files into larger files to deliver even faster query performance.
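To illustrate why compaction helps, here is a minimal sketch that groups small files into batches of roughly a target size, each of which would be rewritten as one larger file. The greedy strategy, the `plan_compaction` name, and the 128 MB target are assumptions chosen for illustration; this post does not describe how R2 Data Catalog will plan compaction.

```python
def plan_compaction(file_sizes_mb, target_mb=128):
    """Greedily group files (in order) into batches of at most target_mb.

    A file at or above the target size ends up in a batch of its own.
    """
    batches, current, current_size = [], [], 0
    for size in file_sizes_mb:
        # Close the current batch if adding this file would exceed the target.
        if current and current_size + size > target_mb:
            batches.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        batches.append(current)
    return batches


small_files = [40, 50, 30, 60, 10, 100]  # hypothetical file sizes in MB
batches = plan_compaction(small_files)
print(batches)
```

Here six input files become three rewritten files, so a query engine opens half as many objects to read the same data.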
We’re also collaborating with the broader Apache Iceberg community to expand query-engine compatibility with the Iceberg REST Catalog spec.
We’d love your feedback. Join the Cloudflare Developer Discord to ask questions and share your thoughts during the public beta. For more details, examples, and guides, visit our developer documentation.
"],"published_at":[0,"2025-04-10T14:00+00:00"],"updated_at":[0,"2025-04-16T15:43:33.601Z"],"feature_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3DLiKvo4JZk27ITbyBcZHm/b49fe94354e7f89b5e55cdf722fe2732/Feature_Image.png"],"tags":[1,[[0,{"id":[0,"2xCnBweKwOI3VXdYsGVbMe"],"name":[0,"Developer Week"],"slug":[0,"developer-week"]}],[0,{"id":[0,"419aJYheeNglKZlN8yunB6"],"name":[0,"R2"],"slug":[0,"r2"]}],[0,{"id":[0,"aHsK2p2ryRcfUSs4nMge0"],"name":[0,"Data Catalog"],"slug":[0,"data-catalog"]}],[0,{"id":[0,"7lB8a8hOPXzjt99X5Ye9wb"],"name":[0,"Storage"],"slug":[0,"storage"]}],[0,{"id":[0,"3JAY3z7p7An94s6ScuSQPf"],"name":[0,"Developer Platform"],"slug":[0,"developer-platform"]}],[0,{"id":[0,"6QktrXeEFcl4e2dZUTZVGl"],"name":[0,"Product News"],"slug":[0,"product-news"]}]]],"relatedTags":[0],"authors":[1,[[0,{"name":[0,"Phillip Jones"],"slug":[0,"phillip"],"bio":[0,null],"profile_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5KTNNpw9VHuwoZlwWsA7MN/c50f3f98d822a0fdce3196d7620d714e/phillip.jpg"],"location":[0,null],"website":[0,null],"twitter":[0,"@akaphill"],"facebook":[0,null],"publiclyIndex":[0,true]}],[0,{"name":[0,"Garvit Gupta"],"slug":[0,"garvit-gupta"],"bio":[0],"profile_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5bHj4wTbnOq2jqn0UloCz8/061a096c9f0e2c7dafafa36eccff32d1/Garvit_Gupta.jpg"],"location":[0],"website":[0,"linkedin.com/in/garvitg/"],"twitter":[0],"facebook":[0],"publiclyIndex":[0,true]}],[0,{"name":[0,"Alex Graham"],"slug":[0,"alex-graham"],"bio":[0],"profile_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7JHQi5kSLLLe5Xv3UF3Wpu/78b42f6f5628c41a83ac08c537cda62f/_tmp_mini_magick20240416-2-nemxat.jpg"],"location":[0],"website":[0],"twitter":[0],"facebook":[0],"publiclyIndex":[0,true]}],[0,{"name":[0,"Garrett Gu"],"slug":[0,"garrett-gu"],"bio":[0,"Passionate about compilers, security, web, and 
audio."],"profile_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/73rzJLaDaPLOFGzOEVYmsX/d0d6b2e6f8d19e70bc3d35fb52ed682a/garrett-gu.jpg"],"location":[0,"Austin, TX"],"website":[0,"garrettgu.com"],"twitter":[0,null],"facebook":[0,null],"publiclyIndex":[0,true]}]]],"meta_description":[0,"R2 Data Catalog is now in public beta: a managed Apache Iceberg data catalog built directly into your R2 bucket."],"primary_author":[0,{}],"localeList":[0,{"name":[0,"blog-english-only"],"enUS":[0,"English for Locale"],"zhCN":[0,"No Page for Locale"],"zhHansCN":[0,"No Page for Locale"],"zhTW":[0,"No Page for Locale"],"frFR":[0,"No Page for Locale"],"deDE":[0,"No Page for Locale"],"itIT":[0,"No Page for Locale"],"jaJP":[0,"No Page for Locale"],"koKR":[0,"No Page for Locale"],"ptBR":[0,"No Page for Locale"],"esLA":[0,"No Page for Locale"],"esES":[0,"No Page for Locale"],"enAU":[0,"No Page for Locale"],"enCA":[0,"No Page for Locale"],"enIN":[0,"No Page for Locale"],"enGB":[0,"No Page for Locale"],"idID":[0,"No Page for Locale"],"ruRU":[0,"No Page for Locale"],"svSE":[0,"No Page for Locale"],"viVN":[0,"No Page for Locale"],"plPL":[0,"No Page for Locale"],"arAR":[0,"No Page for Locale"],"nlNL":[0,"No Page for Locale"],"thTH":[0,"No Page for Locale"],"trTR":[0,"No Page for Locale"],"heIL":[0,"No Page for Locale"],"lvLV":[0,"No Page for Locale"],"etEE":[0,"No Page for Locale"],"ltLT":[0,"No Page for Locale"]}],"url":[0,"https://blog.cloudflare.com/r2-data-catalog-public-beta"],"metadata":[0,{"title":[0,"R2 Data Catalog: Managed Apache Iceberg tables with zero egress fees"],"description":[0,"R2 Data Catalog is now in public beta: a managed Apache Iceberg data catalog built directly into your R2 bucket."],"imgPreview":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3INh24m5qXpQPIsiXNpIzt/220e0b83803496d3724d5698cf1ce55e/OG_Share_2024__42_.png"]}],"publicly_index":[0,true]}],"translations":[0,{"posts.by":[0,"By"],"footer.gdpr":[0,"GDPR"],"lang_blurb1":[0,"This post is also 
available in {lang1}."],"lang_blurb2":[0,"This post is also available in {lang1} and {lang2}."],"lang_blurb3":[0,"This post is also available in {lang1}, {lang2} and {lang3}."],"footer.press":[0,"Press"],"header.title":[0,"The Cloudflare Blog"],"search.clear":[0,"Clear"],"search.filter":[0,"Filter"],"search.source":[0,"Source"],"footer.careers":[0,"Careers"],"footer.company":[0,"Company"],"footer.support":[0,"Support"],"footer.the_net":[0,"theNet"],"search.filters":[0,"Filters"],"footer.our_team":[0,"Our team"],"footer.webinars":[0,"Webinars"],"page.more_posts":[0,"More posts"],"posts.time_read":[0,"{time} min read"],"search.language":[0,"Language"],"footer.community":[0,"Community"],"footer.resources":[0,"Resources"],"footer.solutions":[0,"Solutions"],"footer.trademark":[0,"Trademark"],"header.subscribe":[0,"Subscribe"],"footer.compliance":[0,"Compliance"],"footer.free_plans":[0,"Free plans"],"footer.impact_ESG":[0,"Impact/ESG"],"posts.follow_on_X":[0,"Follow on X"],"footer.help_center":[0,"Help center"],"footer.network_map":[0,"Network Map"],"header.please_wait":[0,"Please Wait"],"page.related_posts":[0,"Related posts"],"search.result_stat":[0,"Results {search_range} of {search_total} for {search_keyword}"],"footer.case_studies":[0,"Case Studies"],"footer.connect_2024":[0,"Connect 2024"],"footer.terms_of_use":[0,"Terms of Use"],"footer.white_papers":[0,"White Papers"],"footer.cloudflare_tv":[0,"Cloudflare TV"],"footer.community_hub":[0,"Community Hub"],"footer.compare_plans":[0,"Compare plans"],"footer.contact_sales":[0,"Contact Sales"],"header.contact_sales":[0,"Contact Sales"],"header.email_address":[0,"Email Address"],"page.error.not_found":[0,"Page not found"],"footer.developer_docs":[0,"Developer docs"],"footer.privacy_policy":[0,"Privacy Policy"],"footer.request_a_demo":[0,"Request a demo"],"page.continue_reading":[0,"Continue reading"],"footer.analysts_report":[0,"Analyst reports"],"footer.for_enterprises":[0,"For 
enterprises"],"footer.getting_started":[0,"Getting Started"],"footer.learning_center":[0,"Learning Center"],"footer.project_galileo":[0,"Project Galileo"],"pagination.newer_posts":[0,"Newer Posts"],"pagination.older_posts":[0,"Older Posts"],"posts.social_buttons.x":[0,"Discuss on X"],"search.icon_aria_label":[0,"Search"],"search.source_location":[0,"Source/Location"],"footer.about_cloudflare":[0,"About Cloudflare"],"footer.athenian_project":[0,"Athenian Project"],"footer.become_a_partner":[0,"Become a partner"],"footer.cloudflare_radar":[0,"Cloudflare Radar"],"footer.network_services":[0,"Network services"],"footer.trust_and_safety":[0,"Trust & Safety"],"header.get_started_free":[0,"Get Started Free"],"page.search.placeholder":[0,"Search Cloudflare"],"footer.cloudflare_status":[0,"Cloudflare Status"],"footer.cookie_preference":[0,"Cookie Preferences"],"header.valid_email_error":[0,"Must be valid email."],"search.result_stat_empty":[0,"Results {search_range} of {search_total}"],"footer.connectivity_cloud":[0,"Connectivity cloud"],"footer.developer_services":[0,"Developer services"],"footer.investor_relations":[0,"Investor relations"],"page.not_found.error_code":[0,"Error Code: 404"],"search.autocomplete_title":[0,"Insert a query. 
Press enter to send"],"footer.logos_and_press_kit":[0,"Logos & press kit"],"footer.application_services":[0,"Application services"],"footer.get_a_recommendation":[0,"Get a recommendation"],"posts.social_buttons.reddit":[0,"Discuss on Reddit"],"footer.sse_and_sase_services":[0,"SSE and SASE services"],"page.not_found.outdated_link":[0,"You may have used an outdated link, or you may have typed the address incorrectly."],"footer.report_security_issues":[0,"Report Security Issues"],"page.error.error_message_page":[0,"Sorry, we can't find the page you are looking for."],"header.subscribe_notifications":[0,"Subscribe to receive notifications of new posts:"],"footer.cloudflare_for_campaigns":[0,"Cloudflare for Campaigns"],"header.subscription_confimation":[0,"Subscription confirmed. Thank you for subscribing!"],"posts.social_buttons.hackernews":[0,"Discuss on Hacker News"],"footer.diversity_equity_inclusion":[0,"Diversity, equity & inclusion"],"footer.critical_infrastructure_defense_project":[0,"Critical Infrastructure Defense Project"]}]}" ssr="" client="load" opts="{"name":"PostCard","value":true}" await-children="">2025-04-10
R2 Data Catalog is now in public beta: a managed Apache Iceberg data catalog built directly into your R2 bucket....