Technological and computational advancements have led to a significant increase in the volume and complexity of geoscientific research data. This provides new opportunities for data-driven methods, but also poses new challenges for research data management. It is not only the growing volume of data that complicates data management; recent techniques, particularly machine learning approaches, require a re-evaluation of data handling practices, as they not only demand more storage and processing power but also involve more extensive data processing, resulting in more intricate data workflows. To address these challenges, we present a modular pipeline framework designed to automate the key stages of the research data lifecycle (acquisition, transformation, processing, and storage) for heterogeneous geomorphological and geochronological research data across disciplinary boundaries. This approach goes beyond data storage, providing an end-to-end data pipeline that bridges the gap between fieldwork, laboratory analysis, and scientific evaluation. By automating data workflows, the pipeline enables the seamless flow of heterogeneous data from acquisition into a relational database and its subsequent transfer to an analytical database optimised for multidimensional queries. This dual-database architecture enables scalable storage, reproducible workflows, and the automation of complex analyses, including statistical and machine learning modelling. A case study in Western Romania illustrates the application of our approach in an interdisciplinary geoarchaeological project, focusing on the integration of diverse, high-dimensional sedimentological datasets. Our work shows that research data pipelines can play an essential role in promoting reproducibility and replication in geoscience.