Economic Data Engineering

Economic data engineering deliberately designs novel forms of data to solve fundamental identification problems associated with economic models of choice. I outline three diverse applications: to the economics of information; to life-cycle employment, earnings, and spending; and to public policy analysis. In all three cases one and the same fundamental identification problem is driving data innovation: that of separately identifying appropriately rich preferences and beliefs. In addition to presenting these conceptually linked examples, I provide a general overview of the engineering process, outline important next steps, and highlight larger opportunities.

Deep thanks to Daniel Martin, Stefan Bucher, and Soren Leth-Petersen for helping me conceptualize the broad sweep and implement the fine details of this paper, four anonymous referees and the editor for helping me sharpen ideas and focus, as well as Ernst Fehr, Mike Woodford, Markus Reinhard, Danny Goroff, Ruben Garcia-Santos and participants in the Sloan-NOMIS Program on the Cognitive Foundations of Economic Behavior for supporting the broader push toward interdisciplinary research. The views expressed herein are those of the author and do not necessarily reflect the views of the National Bureau of Economic Research.



Data Science: Recently Published Documents


Assessing the effects of fuel energy consumption, foreign direct investment and GDP on CO2 emission: New data science evidence from Europe & Central Asia

Documentation Matters: Human-Centered AI System to Assist Data Science Code Documentation in Computational Notebooks

Computational notebooks allow data scientists to express their ideas through a combination of code and documentation. However, data scientists often pay attention only to the code, and neglect creating or updating their documentation during quick iterations. Inspired by human documentation practices learned from 80 highly-voted Kaggle notebooks, we design and implement Themisto, an automated documentation generation system to explore how human-centered AI systems can support human data scientists in the machine learning code documentation scenario. Themisto facilitates the creation of documentation via three approaches: a deep-learning-based approach to generate documentation for source code, a query-based approach to retrieve online API documentation for source code, and a user prompt approach to nudge users to write documentation. We evaluated Themisto in a within-subjects experiment with 24 data science practitioners, and found that automated documentation generation techniques reduced the time for writing documentation, reminded participants to document code they would have ignored, and improved participants’ satisfaction with their computational notebook.

Data science in the business environment: Insight management for an Executive MBA

Adventures in Financial Data Science

GeCoAgent: A Conversational Agent for Empowering Genomic Data Extraction and Analysis

With the availability of reliable and low-cost DNA sequencing, human genomics is relevant to a growing number of end-users, including biologists and clinicians. Typical interactions require applying comparative data analysis to huge repositories of genomic information for building new knowledge, taking advantage of the latest findings in applied genomics for healthcare. Powerful technology for data extraction and analysis is available, but broad use of the technology is hampered by the complexity of accessing such methods and tools. This work presents GeCoAgent, a big-data service for clinicians and biologists. GeCoAgent uses a dialogic interface, animated by a chatbot, for supporting the end-users’ interaction with computational tools accompanied by multi-modal support. While the dialogue progresses, the user is accompanied in extracting the relevant data from repositories and then performing data analysis, which often requires the use of statistical methods or machine learning. Results are returned using simple representations (spreadsheets and graphics), while at the end of a session the dialogue is summarized in textual format. The innovation presented in this article is concerned with not only the delivery of a new tool but also our novel approach to conversational technologies, potentially extensible to other healthcare domains or to general data science.

Differentially Private Medical Texts Generation Using Generative Neural Networks

Technological advancements in data science have offered us affordable storage and efficient algorithms to query a large volume of data. Our health records are a significant part of this data, which is pivotal for healthcare providers and can be used for our well-being. The clinical note in electronic health records is one such category: it collects a patient’s complete medical information at different timesteps of patient care in the form of free text. These unstructured textual notes thus contain events from a patient’s admission to discharge, which can prove significant for future medical decisions. However, since these texts also contain sensitive information about the patient and the attending medical professionals, such notes cannot be shared publicly. This privacy issue has thwarted timely discoveries on this plethora of untapped information. Therefore, in this work, we intend to generate synthetic medical texts from a private or sanitized (de-identified) clinical text corpus and analyze their utility rigorously in different metrics and levels. Experimental results promote the applicability of our generated data as it achieves more than 80% accuracy in different pragmatic classification problems and matches (or outperforms) the original text data.

Impact on Stock Market across Covid-19 Outbreak

Abstract: This paper analyzes the impact of the pandemic on global stock exchanges. Stock listing values are determined by a variety of factors, including seasonal changes, catastrophic calamities, pandemics, fiscal-year changes, and many more. The paper analyzes the variation in listing prices during the worldwide outbreak of the novel coronavirus. The key reason for focusing on this outbreak was to provide insight into the underlying regulation of stock exchanges. Daily closing prices of stock indices from January 2017 to January 2022 have been used for the analysis. The main aim of the research is to analyze whether a downturn in the global economy affects the stock exchanges. Keywords: Stock Exchange, Matplotlib, Streamlit, Data Science, Web scraping.

Information Resilience: the nexus of responsible and agile approaches to information use

Abstract: The appetite for effective use of information assets has been steadily rising in both public and private sector organisations. However, whether the information is used for social good or commercial gain, there is a growing recognition of the complex socio-technical challenges associated with balancing the diverse demands of regulatory compliance and data privacy, social expectations and ethical use, business process agility and value creation, and scarcity of data science talent. In this vision paper, we present a series of case studies that highlight these interconnected challenges, across a range of application areas. We use the insights from the case studies to introduce Information Resilience, as a scaffold within which the competing requirements of responsible and agile approaches to information use can be positioned. The aim of this paper is to develop and present a manifesto for Information Resilience that can serve as a reference for future research and development in relevant areas of responsible data management.

qEEG Analysis in the Diagnosis of Alzheimer’s Disease; a Comparison of Functional Connectivity and Spectral Analysis

Alzheimer’s disease (AD) is a brain disorder that is mainly characterized by a progressive degeneration of neurons in the brain, causing a decline in cognitive abilities and difficulties in engaging in day-to-day activities. This study compares an FFT-based spectral analysis against a functional connectivity analysis based on phase synchronization, for finding known differences between AD patients and Healthy Control (HC) subjects. Both of these quantitative analysis methods were applied to a dataset comprising bipolar EEG montage values from 20 diagnosed AD patients and 20 age-matched HC subjects. Additionally, an attempt was made to localize the identified AD-induced brain activity effects in AD patients. The obtained results showed the advantage of the functional connectivity analysis method compared to a simple spectral analysis. Specifically, while spectral analysis could not find any significant differences between the AD and HC groups, the functional connectivity analysis showed statistically higher synchronization levels in the AD group in the lower frequency bands (delta and theta), suggesting that the AD patients’ brains are in a phase-locked state. Further comparison of functional connectivity between the homotopic regions confirmed that the traits of AD were localized in the centro-parietal and centro-temporal areas in the theta frequency band (4-8 Hz). The contribution of this study is that it applies a neural metric for Alzheimer’s detection from a data science perspective rather than from a neuroscience one. The study shows that the combination of bipolar derivations with phase synchronization yields similar results to comparable studies employing alternative analysis methods.
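The idea of phase synchronization between two EEG channels can be illustrated with a phase-locking value (PLV). The sketch below is not the study's pipeline: the abstract does not say which synchronization index was used, so PLV is an assumption, as are the function names, the 128 Hz sampling rate, and the synthetic theta-band sinusoids. It also skips band-pass filtering and uses a naive O(n²) DFT to stay dependency-free.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(n^2)); fine for short toy signals."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def analytic_signal(x):
    """Hilbert-style analytic signal: keep DC, double positive frequencies, drop negative ones."""
    n = len(x)
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0          # Nyquist bin is kept once
        upper = n // 2
    else:
        upper = (n + 1) // 2
    for k in range(1, upper):
        weights[k] = 2.0
    spectrum = dft(x)
    return idft([s * w for s, w in zip(spectrum, weights)])

def phase_locking_value(x, y):
    """PLV = |mean(exp(i * (phase_x - phase_y)))|, between 0 (no locking) and 1 (perfect locking)."""
    ax, ay = analytic_signal(x), analytic_signal(y)
    unit_vectors = [cmath.exp(1j * (cmath.phase(a) - cmath.phase(b)))
                    for a, b in zip(ax, ay)]
    return abs(sum(unit_vectors) / len(unit_vectors))

# Synthetic one-second "channels" in the theta band (hypothetical data).
fs = 128
t = [k / fs for k in range(fs)]
theta_a = [math.sin(2 * math.pi * 6 * s) for s in t]
theta_b = [math.sin(2 * math.pi * 6 * s + 0.8) for s in t]  # same frequency, fixed lag
theta_c = [math.sin(2 * math.pi * 7 * s) for s in t]        # different frequency

plv_locked = phase_locking_value(theta_a, theta_b)
plv_drift = phase_locking_value(theta_a, theta_c)
```

Two channels with a constant phase lag give a PLV close to 1, while channels whose phase difference drifts through a full cycle give a PLV close to 0. On real recordings one would first band-pass filter each channel to the band of interest.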

Big Data Analytics for Long-Term Meteorological Observations at Hanford Site

A growing number of physical objects with embedded sensors with typically high volume and frequently updated data sets has accentuated the need to develop methodologies to extract useful information from big data for supporting decision making. This study applies a suite of data analytics and core principles of data science to characterize near real-time meteorological data with a focus on extreme weather events. To highlight the applicability of this work and make it more accessible from a risk management perspective, a foundation for a software platform with an intuitive Graphical User Interface (GUI) was developed to access and analyze data from a decommissioned nuclear production complex operated by the U.S. Department of Energy (DOE, Richland, USA). Exploratory data analysis (EDA), involving classical non-parametric statistics, and machine learning (ML) techniques, were used to develop statistical summaries and learn characteristic features of key weather patterns and signatures. The new approach and GUI provide key insights into using big data and ML to assist site operation related to safety management strategies for extreme weather events. Specifically, this work offers a practical guide to analyzing long-term meteorological data and highlights the integration of ML and classical statistics to applied risk and decision science.
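One classical non-parametric way to flag extreme readings in a meteorological series is Tukey's interquartile-range fences. The abstract does not name the statistics the authors used, so the rule, the `flag_extremes` helper, the multiplier `k=1.5`, and the toy wind-speed values below are all illustrative assumptions.

```python
import statistics

def flag_extremes(values, k=1.5):
    """Return indices of readings outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]

# Hypothetical wind-speed readings; the last one is a gust far above the rest.
wind_speed = [10, 11, 12, 11, 10, 12, 11, 40]
extreme_idx = flag_extremes(wind_speed)
```

Because the fences depend only on quartiles, a single extreme gust does not distort the threshold the way a mean and standard deviation would.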


  • Conference proceedings
  • © 2022

Data Engineering for Smart Systems

Proceedings of SSIC 2021

  • Priyadarsi Nanda
  • Vivek Kumar Verma
  • Sumit Srivastava
  • Rohit Kumar Gupta
  • Arka Prokash Mazumdar

School of Electrical and Data Engineering, University of Technology Sydney, Sydney, Australia


Manipal University Jaipur, Jaipur, India

Department of Computer Science and Engineering, Malaviya National Institute of Technology, Jaipur, India

Presents recent research in the field of data engineering

Discusses the outcomes of SSIC 2021, held in Manipal University Jaipur, India

Serves as a reference guide for researchers and practitioners in academia and industry

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 238)



Table of contents (63 papers)

Using Machine Learning, Image Processing and Neural Networks to Sense Bullying in K-12 Schools: Enhanced

  • Lalit Kumar, Palash Goyal, Karan Malik, Rishav Kumar, Dhruv Shrivastav

Feature-Based Comparative Study of Machine Learning Algorithms for Credibility Analysis of Online Social Media Content

  • Utkarsh Sharma, Shishir Kumar

Smart Support System for Navigation of Visually Challenged Person Using IOT

  • Tuhin Utsab Paul, Aninda Ghosh

Identity-Based Video Summarization

  • Soummya Kulkarni, Darshana Bhagit, Masooda Modak

Security Testing for Blockchain Enabled IoT System

  • A. B. Yugakiruthika, A. Malini

Two-Dimensional Software Reliability Model with Considering the Uncertainty in Operating Environment and Predictive Analysis

  • Ramgopal Dhaka, Bhoopendra Pachauri, Anamika Jain

Object Recognition in a Cluttered Scene

  • Rashmee Shrestha, Mandeep Kaur, Nitin Rakesh, Parma Nand

Breast Cancer Prediction on BreakHis Dataset Using Deep CNN and Transfer Learning Model

  • Pinky Agarwal, Anju Yadav, Pratistha Mathur

A Comprehensive Tool Survey for Blockchain to IoT Applications

Hybrid Ensemble for Fake News Detection: An Attempt

  • Lovedeep Singh

Detection of Abnormal Activity at College Entrance Through Video Surveillance

  • Lalit Damahe, Saurabh Diwe, Shailesh Kamble, Sandeep Kakde, Praful Barekar

Audio Peripheral Volume Automation Based on the Surrounding Environment and Individual Human Listening Traits

  • Adit Doshi, Helly Patel, Rikin Patel, Brijesh Satasiya, Muskan Kapadia, Nirali Nanavati

A Survey: Accretion in Linguistic Classification of Indian Languages

  • Dipjayaben Patel

The Positive Electronic Word of Mouth: A Research Based on the Relational Mediator Meta-Analytic Framework in Electronic Marketplace

  • Bui Thanh Khoa

A Review: Web Content Mining Techniques

  • Priyanka Shah, Hardik B. Pandit

GPS-Free Localization in Vehicular Networks Using Directional Antennas

  • Parveen, Sushil Kumar, Rishipal Singh

An Improved Scheme in AODV Routing Protocol for Enhancement of QoS in MANET

  • Amit Kumar Bairwa, Sandeep Joshi

Knowledge Management Framework for Sustainability and Resilience in Next-Gen e-Governance

  • Iqbal Hasan, Sam Rizvi

Enriching WordNet with Subject Specific Out of Vocabulary Terms Using Existing Ontology

  • Kanika, Shampa Chakraverty, Pinaki Chakraborty, Aditya Aggarwal, Manan Madan, Gaurav Gupta

Keywords

  • SSIC Proceedings
  • Data Engineering
  • Data Analysis
  • Data Innovation and Management
  • Data Network


Book Title: Data Engineering for Smart Systems

Book Subtitle: Proceedings of SSIC 2021

Editors: Priyadarsi Nanda, Vivek Kumar Verma, Sumit Srivastava, Rohit Kumar Gupta, Arka Prokash Mazumdar

Series Title: Lecture Notes in Networks and Systems

Publisher: Springer Singapore

eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)

Copyright Information: The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022

Softcover ISBN: 978-981-16-2640-1 Published: 14 November 2021

eBook ISBN: 978-981-16-2641-8 Published: 13 November 2021

Series ISSN: 2367-3370

Series E-ISSN: 2367-3389

Edition Number: 1

Number of Pages: XXII, 681

Number of Illustrations: 79 b/w illustrations, 197 illustrations in colour

Topics: Data Engineering, Systems and Data Security, Data Structures and Information Theory, Artificial Intelligence, Big Data

Software Engineering for Data Analytics



Electrical Engineering and Systems Science > Signal Processing

Title: Guiding Masked Representation Learning to Capture Spatio-Temporal Relationship of Electrocardiogram

Abstract: Electrocardiograms (ECG) are widely employed as a diagnostic tool for monitoring electrical signals originating from a heart. Recent machine learning research efforts have focused on the application of screening various diseases using ECG signals. However, adapting to the application of screening disease is challenging in that labeled ECG data are limited. Achieving general representation through self-supervised learning (SSL) is a well-known approach to overcome the scarcity of labeled data; however, a naive application of SSL to ECG data, without considering the spatial-temporal relationships inherent in ECG signals, may yield suboptimal results. In this paper, we introduce ST-MEM (Spatio-Temporal Masked Electrocardiogram Modeling), designed to learn spatio-temporal features by reconstructing masked 12-lead ECG data. ST-MEM outperforms other SSL baseline methods in various experimental settings for arrhythmia classification tasks. Moreover, we demonstrate that ST-MEM is adaptable to various lead combinations. Through quantitative and qualitative analysis, we show a spatio-temporal relationship within ECG data.
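The masking step that precedes reconstruction can be sketched as follows. This is not the ST-MEM implementation: the `mask_ecg_patches` helper, the patch length, the mask ratio, the choice to zero hidden patches, and masking each lead independently are assumptions made for illustration. The encoder and decoder that would reconstruct the hidden patches are omitted.

```python
import random

def mask_ecg_patches(ecg, patch_len, mask_ratio, seed=0):
    """Hide a random subset of fixed-length temporal patches in every lead.

    ecg: list of leads, each a list of samples of equal length.
    Returns (masked_ecg, mask), where mask[lead][p] is True if patch p was hidden.
    """
    rng = random.Random(seed)
    n_samples = len(ecg[0])
    if any(len(lead) != n_samples for lead in ecg):
        raise ValueError("all leads must have the same length")
    if n_samples % patch_len != 0:
        raise ValueError("signal length must be a multiple of patch_len")
    n_patches = n_samples // patch_len
    n_masked = round(mask_ratio * n_patches)

    masked_ecg, mask = [], []
    for lead in ecg:
        hidden = set(rng.sample(range(n_patches), n_masked))
        out = list(lead)                      # copy so the input is left untouched
        for p in hidden:
            start = p * patch_len
            out[start:start + patch_len] = [0.0] * patch_len
        masked_ecg.append(out)
        mask.append([p in hidden for p in range(n_patches)])
    return masked_ecg, mask

# Toy 12-lead recording: 40 samples per lead, constant value per lead (hypothetical data).
ecg = [[float(lead + 1)] * 40 for lead in range(12)]
masked, mask = mask_ecg_patches(ecg, patch_len=10, mask_ratio=0.75)
```

A model trained on such inputs would be asked to predict the original samples at the positions where `mask` is True. Because each lead hides different patches, information from other leads and from neighbouring time steps can both help the reconstruction.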





  1. 1063 PDFs

    Explore the latest full-text research PDFs, articles, conference papers, preprints and more on DATA ENGINEERING. Find methods information, sources, references or conduct a literature review...

  2. [2102.11447] Data Engineering for Everyone

    Data engineering is one of the fastest-growing fields within machine learning (ML). As ML becomes more common, the appetite for data grows more ravenous. But ML requires more data than individual teams of data engineers can readily produce, which presents a severe challenge to ML deployment at scale. Much like the software-engineering revolution, where mass adoption of open-source software ...

  3. (PDF) Evolving Paradigms of Data Engineering in the Modern Era


  4. IEEE Transactions on Knowledge and Data Engineering


  5. Data-driven engineering design: A systematic review ...

    Currently, we are in the process of intensive digitalisation, and consequently, the data generated by the millions of items that surround us is constantly increasing. The need to assess and exploit these data has resulted in the rapid development of methods and technologies known as big data analytics (BDA).

  6. Home

    Aims and scope: a peer-reviewed, open access journal focusing on theoretical background and advanced engineering approaches in data science and engineering.

  7. Articles

    Efficient Top-k Frequent Itemset Mining on Massive Data. Xiaolong Wan, Xixian Han. Research Paper, open access, 06 February 2024.
    Where To Go at the Next Timestamp. Jiaqi Duan, Xiangfu Meng, Guihong Liu. Research Paper, open access, 28 January 2024.
    Construct and Query A Fine-Grained Geospatial Knowledge Graph. Bo Wei, Xi Guo

  8. Economic Data Engineering

    Andrew Caplin. Working Paper 29378. DOI 10.3386/w29378. Issue Date October 2021. Economic data engineering deliberately designs novel forms of data to solve fundamental identification problems associated with economic models of choice. I outline three diverse applications: to the economics of information; to life-cycle employment, earnings, and ...

  9. Data, Engineering and Applications

    About this book. The book contains select proceedings of the 3rd International Conference on Data, Engineering, and Applications (IDEA 2021). It includes papers from experts in industry and academia that address state-of-the-art research in the areas of big data, data mining, machine learning, data science, and their associated learning systems ...

  10. Ten Research Challenge Areas in Data Science

    Abstract. To drive progress in the field of data science, we propose 10 challenge areas for the research community to pursue. Since data science is broad, with methods drawing from computer science, statistics, and other disciplines, and with applications appearing in all sectors, these challenge areas speak to the breadth of issues spanning science, technology, and society.

  11. data science Latest Research Papers

    Assessing the effects of fuel energy consumption, foreign direct investment and GDP on CO2 emission: New data science evidence from Europe & Central Asia. Fuel. 10.1016/j.fuel.2021.123098. 2022. Vol 314. pp. 123098. Author(s): Muhammad Mohsin, Sobia Naseem.

  12. (PDF) The Essence of Data Engineering

    This paper provides a brief introduction to data engineering. Content uploaded by Kelechi Eze.

  13. PDF Software Engineering for Data Analytics

    I summarize findings from empirical studies of professional data scientists in collaboration with Microsoft Research. In my opinion, key differences exist between traditional software development versus data-centric development, which makes it hard for software engineers to debug and test data-centric software or AI/ML-based software systems.

  14. Data-Centric Engineering

    Data-Centric Engineering (DCE) is a peer-reviewed open-access journal dedicated to the transformative impact of data science for research and practice across all areas of engineering. Articles explore the benefits of data science methods and models for improving the reliability, resilience, safety, efficiency and usability of engineered systems.

  15. PDF SE4DA--Software Engineering for Data Analytics

    only 13 out of 285 papers (4% of research papers in ASE 2016-2019) focused on improving SE for DA (Figure 1). In this article, I hope to make a case that we, the software engineering research community, should expand its research scope to extend and adapt existing software engineering to meet the new demands of data-centric software develop-

  16. Data Engineering for Smart Systems

    About this book. This book features original papers from the 3rd International Conference on Smart IoT Systems: Innovations and Computing (SSIC 2021), organized by Manipal University, Jaipur, India, during January 22-23, 2021. It discusses scientific works related to data engineering in the context of computational collective intelligence ...

  17. Software Engineering for Data Analytics

    Abstract: We are at an inflection point where software engineering meets the data-centric world of big data, machine learning, and artificial intelligence. In this article, I summarize findings from studies of professional data scientists and discuss my perspectives on open research problems to improve data-centric software development.

  18. Data Engineering for Data Analytics: A ...

    This paper advocates for a standardized data engineering approach for data science and presents a layered architecture for a data processing pipeline (DPP), which provides a comprehensive conceptual view of DPPs, which next enables the semi-automation of the logical and physical designs of such DPPs.

  19. Data Science for Advancing Environmental Science, Engineering, and

    Data Science for Advancing Environmental Science, Engineering, and Technology: Upcoming Special and Virtual Issues in ES&T and ES&T Letters. Environ. Sci. Technol. 2022, 56, 9827−9828. A defining characteristic of environmental science and engineering research is the ...

  20. (PDF) Knowledge and data engineering

    Research on knowledge and data engineering is examined with respect to programmability and representation, design tradeoffs, algorithms and control, and emerging technologies. Future challenges...

  21. DKE

    DKE covers the following topics: 1. Representation and Manipulation of Data & Knowledge: Conceptual data models. Knowledge representation techniques. Data/knowledge manipulation languages and techniques. 2.

  22. Economic Data Engineering by Andrew Caplin :: SSRN

    Abstract. Economic data engineering deliberately designs novel forms of data to solve fundamental identification problems associated with economic models of choice. I outline three diverse applications: to the economics of information; to life-cycle employment, earnings, and spending; and to public policy analysis. In all three cases one and ...

  23. PDF Snowflake for Data Engineering

    Learn how Snowflake for Data Engineering can help you easily ingest, transform, and deliver data for up-to-the-moment insight. This PDF guide covers the key features, benefits, and use cases of Snowflake's cloud data platform for data engineering tasks. Discover how Snowflake can simplify and accelerate your data pipelines, reduce costs and complexity, and enable data-driven decision making.

  24. [2402.09450] Guiding Masked Representation Learning to Capture Spatio

    Electrocardiograms (ECG) are widely employed as a diagnostic tool for monitoring electrical signals originating from a heart. Recent machine learning research efforts have focused on the application of screening various diseases using ECG signals. However, adapting to the application of screening disease is challenging in that labeled ECG data are limited. Achieving general representation ...
