4th Table Representation Learning Workshop @ ACL 2025

July 31st 2025, Vienna, Austria.



Mailing list: sign up here!
Follow on Bluesky: @trl-research
New: Join the TRL Discord!

Tables are a promising modality for representation learning and generative models with too much application potential to ignore. However, tables have long been overlooked despite their dominant presence in the data landscape, e.g. in data management and analysis pipelines. The majority of datasets in Google Dataset Search, for example, resemble typical tabular file formats like CSV. Similarly, the top-3 most-used database management systems are all intended for relational data. Representation learning for tables, possibly combined with other modalities such as code and text, has shown impressive performance for tasks like semantic parsing, question answering, table understanding, data preparation, and data analysis (e.g. text-to-SQL). The pre-training paradigm has been shown to be effective for tabular ML (classification/regression) as well. More recently, we have also observed promising potential in applying and enhancing LLMs in the domain of structured data to improve how we process it and derive insights from it.

The Table Representation Learning (TRL) workshop is the premier venue in this emerging research area and has three main goals:

  • (1) Motivate structured data (e.g. tables) as a primary modality for representation learning and generative models and advance the area further.
  • (2) Showcase impactful applications of pretrained table models and identify open challenges for future research, with a particular focus on progress in NLP for this edition at ACL in 2025.
  • (3) Foster discussion and collaboration across the NLP, ML, IR and DB communities.

When: July 31st 2025 (Tentative)
Where: Vienna, Austria
Submit: 15 April 2025, TBD


Call for Papers


Important Dates


Submission Open: February 1, 2025
Submission Deadline: April 15th, 2025 (11:59PM AoE)
Notifications: May 15th, 2025 (11:59PM AoE)
Camera-ready: May 30th, 2025 (11:59PM AoE)
Slides for contributed talks: June 30th, 2025 (11:59PM AoE)
Video pitches for posters (optional): June 30th, 2025 (11:59PM AoE)
Workshop Date: July 31st, 2025 (Tentative)

Scope

We invite submissions on any of, or related to, the following topics on machine learning for tabular data:

  • Representation Learning for (semi-)Structured Data such as spreadsheets, tables, and full relational databases. Example contributions are new model architectures, data encoding techniques, tailored tokenization methods, pre-training and fine-tuning techniques, etc.
  • Generative Models and LLMs for Structured Data such as Large Language Models (LLMs) and diffusion models, and specialized techniques for prompt engineering, single-task and multi-task fine-tuning, LLM-driven interfaces and multi-agent systems, retrieval-augmented generation, etc.
  • Multimodal Learning where structured data is jointly embedded or combined with other modalities such as text, code (e.g., SQL), knowledge graphs, and visualizations/images.
  • Applications of TRL models for tasks like data preparation (e.g. data cleaning, validation, integration, cataloging, feature engineering), retrieval (e.g. data search, fact-checking/QA, KG alignment), analysis (e.g. text-to-SQL and visualization), tabular data generation, (end-to-end) tabular machine learning, table extraction (e.g. parsers/extraction for unstructured data), and query optimization (e.g. cardinality estimation).
  • Challenges of TRL models in production: work addressing the challenges of maintaining and managing TRL models in fast-evolving contexts, e.g., data updating, error correction, and monitoring, handling data privacy, personalization performance, etc.
  • Domain-specific challenges for learned table models, which often arise in domains such as enterprise, finance, medicine, and law. These challenges pertain to table content, table structure, privacy, security limitations, and other factors that necessitate tailored solutions.
  • Benchmarks, analyses, and datasets for TRL including assessing LLMs and other generative models as base models versus alternative approaches, analysis of model robustness with respect to large, messy, and heterogeneous tabular data, etc.
  • Other contributions such as surveys, demonstrations, visions, and reflections on table representation learning and generative models for structured data.

Organization

Workshop Chairs


Qian Liu
ByteDance

Wenhu Chen
University of Waterloo

Filip Gralinski
Snowflake

Huan Sun
The Ohio State University




Program


Invited Speakers





Submission Guidelines

Submission link

Submit your (anonymized) paper through OpenReview at: TBA
Please be aware that accepted papers are expected to be presented in person at the workshop.

Formatting guidelines

The workshop accepts regular research papers and industrial papers of the following types:
  • Short paper: 4 pages + references and appendix.
  • Regular paper: 8 pages + references and appendix.


Submissions should be anonymized and follow the ACL style files, but can exclude the checklist. Non-anonymous preprints are no problem, and artifacts do not have to be anonymized. Just submitting the paper without author names/affiliations is sufficient. Supplementary material, if any, may be added in the appendix. The footer of accepted papers should state “Table Representation Learning Workshop at ACL 2025”. We expect authors to adopt an inclusive and diverse writing style. The “Diversity and Inclusion in Writing” guide by the DE&I in DB Conferences effort is a good resource.

Review process

Papers will receive light reviews in a double-anonymous manner. All accepted submissions will be published on the website and made public on OpenReview, but the workshop is non-archival (i.e. without proceedings).

Novelty and conflicts

The workshop cannot accept submissions that have been published at ACL or other machine learning venues as-is, but we do invite relevant papers from the main conference (ACL) to be submitted to the workshop as 4-page short papers. We also welcome submissions that have been published in, for example, data management or natural language processing venues. We rely on OpenReview for handling conflicts, so please ensure that the conflicts in every author's OpenReview profile are complete, in particular, with respect to the organization and program committees.

Camera-ready instructions

Camera-ready papers are expected to list the authors and affiliations on the first page, and state “Table Representation Learning Workshop at ACL 2025” in the footer. The camera-ready version may exceed the page limit for acknowledgements or small content changes, but revision is not required (for short papers: please be aware of the novelty requirements of archival venues, e.g. SIGMOD, CVPR). The camera-ready version should be submitted through OpenReview (submission -> edit -> revision), and will be published on OpenReview and this website. Please make sure that all metadata is correct as well, as it will be imported to the ACL website.