Economic Data Engineering
Economic data engineering deliberately designs novel forms of data to solve fundamental identification problems associated with economic models of choice. I outline three diverse applications: to the economics of information; to life-cycle employment, earnings, and spending; and to public policy analysis. In all three cases one and the same fundamental identification problem is driving data innovation: that of separately identifying appropriately rich preferences and beliefs. In addition to presenting these conceptually linked examples, I provide a general overview of the engineering process, outline important next steps, and highlight larger opportunities.
Deep thanks to Daniel Martin, Stefan Bucher, and Soren Leth-Petersen for helping me conceptualize the broad sweep and implement the fine details of this paper, four anonymous referees and the editor for helping me sharpen ideas and focus, as well as Ernst Fehr, Mike Woodford, Markus Reinhard, Danny Goroff, Ruben Garcia-Santos and participants in the Sloan-NOMIS Program on the Cognitive Foundations of Economic Behavior for supporting the broader push toward interdisciplinary research. The views expressed herein are those of the author and do not necessarily reflect the views of the National Bureau of Economic Research.
data science Recently Published Documents
Assessing the effects of fuel energy consumption, foreign direct investment and GDP on CO2 emission: New data science evidence from Europe & Central Asia
Documentation Matters: Human-Centered AI System to Assist Data Science Code Documentation in Computational Notebooks
Computational notebooks allow data scientists to express their ideas through a combination of code and documentation. However, data scientists often pay attention only to the code, and neglect creating or updating their documentation during quick iterations. Inspired by human documentation practices learned from 80 highly-voted Kaggle notebooks, we design and implement Themisto, an automated documentation generation system to explore how human-centered AI systems can support human data scientists in the machine learning code documentation scenario. Themisto facilitates the creation of documentation via three approaches: a deep-learning-based approach to generate documentation for source code, a query-based approach to retrieve online API documentation for source code, and a user prompt approach to nudge users to write documentation. We evaluated Themisto in a within-subjects experiment with 24 data science practitioners, and found that automated documentation generation techniques reduced the time for writing documentation, reminded participants to document code they would have ignored, and improved participants’ satisfaction with their computational notebook.
Data science in the business environment: Insight management for an Executive MBA
Adventures in Financial Data Science
GeCoAgent: A Conversational Agent for Empowering Genomic Data Extraction and Analysis
With the availability of reliable and low-cost DNA sequencing, human genomics is relevant to a growing number of end-users, including biologists and clinicians. Typical interactions require applying comparative data analysis to huge repositories of genomic information for building new knowledge, taking advantage of the latest findings in applied genomics for healthcare. Powerful technology for data extraction and analysis is available, but broad use of the technology is hampered by the complexity of accessing such methods and tools. This work presents GeCoAgent, a big-data service for clinicians and biologists. GeCoAgent uses a dialogic interface, animated by a chatbot, for supporting the end-users’ interaction with computational tools accompanied by multi-modal support. While the dialogue progresses, the user is accompanied in extracting the relevant data from repositories and then performing data analysis, which often requires the use of statistical methods or machine learning. Results are returned using simple representations (spreadsheets and graphics), while at the end of a session the dialogue is summarized in textual format. The innovation presented in this article is concerned with not only the delivery of a new tool but also our novel approach to conversational technologies, potentially extensible to other healthcare domains or to general data science.
Differentially Private Medical Texts Generation Using Generative Neural Networks
Technological advancements in data science have offered us affordable storage and efficient algorithms to query a large volume of data. Our health records are a significant part of this data, which is pivotal for healthcare providers and can be utilized in our well-being. The clinical note in electronic health records is one such category that collects a patient’s complete medical information during different timesteps of patient care available in the form of free-texts. Thus, these unstructured textual notes contain events from a patient’s admission to discharge, which can prove to be significant for future medical decisions. However, since these texts also contain sensitive information about the patient and the attending medical professionals, such notes cannot be shared publicly. This privacy issue has thwarted timely discoveries on this plethora of untapped information. Therefore, in this work, we intend to generate synthetic medical texts from a private or sanitized (de-identified) clinical text corpus and analyze their utility rigorously in different metrics and levels. Experimental results promote the applicability of our generated data as it achieves more than 80% accuracy in different pragmatic classification problems and matches (or outperforms) the original text data.
Impact on Stock Market across Covid-19 Outbreak
Abstract: This paper analyses the impact of the pandemic on global stock exchanges. Stock listing values are determined by a variety of factors, including seasonal changes, catastrophic calamities, pandemics, fiscal year changes, and many more. The paper analyses the variation in listing prices during the worldwide outbreak of the novel coronavirus. The key reason for focusing on this outbreak was to provide insight into the underlying regulation of stock exchanges. Daily closing prices of the stock indices from January 2017 to January 2022 have been utilized for the analysis. The predominant aim of the research is to analyse whether a downturn in the global economy impacts the financial stock exchange. Keywords: Stock Exchange, Matplotlib, Streamlit, Data Science, Web scraping.
Information Resilience: the nexus of responsible and agile approaches to information use
Abstract: The appetite for effective use of information assets has been steadily rising in both public and private sector organisations. However, whether the information is used for social good or commercial gain, there is a growing recognition of the complex socio-technical challenges associated with balancing the diverse demands of regulatory compliance and data privacy, social expectations and ethical use, business process agility and value creation, and scarcity of data science talent. In this vision paper, we present a series of case studies that highlight these interconnected challenges, across a range of application areas. We use the insights from the case studies to introduce Information Resilience, as a scaffold within which the competing requirements of responsible and agile approaches to information use can be positioned. The aim of this paper is to develop and present a manifesto for Information Resilience that can serve as a reference for future research and development in relevant areas of responsible data management.
qEEG Analysis in the Diagnosis of Alzheimer's Disease: A Comparison of Functional Connectivity and Spectral Analysis
Alzheimer's disease (AD) is a brain disorder that is mainly characterized by a progressive degeneration of neurons in the brain, causing a decline in cognitive abilities and difficulties in engaging in day-to-day activities. This study compares an FFT-based spectral analysis against a functional connectivity analysis based on phase synchronization, for finding known differences between AD patients and Healthy Control (HC) subjects. Both of these quantitative analysis methods were applied to a dataset comprising bipolar EEG montage values from 20 diagnosed AD patients and 20 age-matched HC subjects. Additionally, an attempt was made to localize the identified AD-induced brain activity effects in AD patients. The obtained results showed the advantage of the functional connectivity analysis method compared to a simple spectral analysis. Specifically, while spectral analysis could not find any significant differences between the AD and HC groups, the functional connectivity analysis showed statistically higher synchronization levels in the AD group in the lower frequency bands (delta and theta), suggesting that the AD patients' brains are in a phase-locked state. Further comparison of functional connectivity between the homotopic regions confirmed that the traits of AD were localized in the centro-parietal and centro-temporal areas in the theta frequency band (4-8 Hz). The contribution of this study is that it applies a neural metric for Alzheimer's detection from a data science perspective rather than from a neuroscience one. The study shows that the combination of bipolar derivations with phase synchronization yields similar results to comparable studies employing alternative analysis methods.
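To make the idea of phase synchronization between two EEG channels concrete, here is a minimal sketch of one common measure, the phase-locking value (PLV). The abstract does not say which synchronization metric the study used, so the choice of PLV, the function names, and the synthetic 6 Hz test signals are all assumptions for illustration. The sketch uses only the Python standard library and skips the band-pass filtering (for example to the 4-8 Hz theta band) that a real pipeline would apply first.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real sequence via a naive O(n^2) DFT.

    Illustrative only: fine for short windows, too slow for long recordings.
    """
    n = len(x)
    # Forward DFT.
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC (and Nyquist when n is even), double the positive
    # frequencies, and zero the negative ones.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        upper = n // 2
    else:
        upper = (n + 1) // 2
    for k in range(1, upper):
        weights[k] = 2.0
    # Inverse DFT of the one-sided spectrum.
    return [
        sum(weights[k] * spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
            for k in range(n)) / n
        for t in range(n)
    ]


def phase_locking_value(x, y):
    """PLV in [0, 1]: magnitude of the mean unit phasor of the phase difference."""
    ax, ay = analytic_signal(x), analytic_signal(y)
    diffs = [cmath.phase(a) - cmath.phase(b) for a, b in zip(ax, ay)]
    return abs(sum(cmath.exp(1j * d) for d in diffs)) / len(diffs)


# Hypothetical one-second windows sampled at 128 Hz.
fs = 128
locked_a = [math.sin(2 * math.pi * 6 * t / fs) for t in range(fs)]
locked_b = [math.sin(2 * math.pi * 6 * t / fs + 0.8) for t in range(fs)]
drifting = [math.sin(2 * math.pi * 7 * t / fs) for t in range(fs)]

plv_locked = phase_locking_value(locked_a, locked_b)    # constant phase lag
plv_drifting = phase_locking_value(locked_a, drifting)  # phase difference rotates
```

Two signals with a constant phase lag give a PLV close to 1, while signals whose phase difference keeps rotating give a value close to 0. A higher PLV between channels is the kind of result the abstract describes as a phase-locked state.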
Big Data Analytics for Long-Term Meteorological Observations at Hanford Site
A growing number of physical objects with embedded sensors with typically high volume and frequently updated data sets has accentuated the need to develop methodologies to extract useful information from big data for supporting decision making. This study applies a suite of data analytics and core principles of data science to characterize near real-time meteorological data with a focus on extreme weather events. To highlight the applicability of this work and make it more accessible from a risk management perspective, a foundation for a software platform with an intuitive Graphical User Interface (GUI) was developed to access and analyze data from a decommissioned nuclear production complex operated by the U.S. Department of Energy (DOE, Richland, USA). Exploratory data analysis (EDA), involving classical non-parametric statistics, and machine learning (ML) techniques, were used to develop statistical summaries and learn characteristic features of key weather patterns and signatures. The new approach and GUI provide key insights into using big data and ML to assist site operation related to safety management strategies for extreme weather events. Specifically, this work offers a practical guide to analyzing long-term meteorological data and highlights the integration of ML and classical statistics to applied risk and decision science.
- Conference proceedings
- © 2022
Data Engineering for Smart Systems
Proceedings of SSIC 2021
- Priyadarsi Nanda,
- Vivek Kumar Verma,
- Sumit Srivastava,
- Rohit Kumar Gupta,
- Arka Prokash Mazumdar
School of Electrical and Data Engineering, University of Technology Sydney, Sydney, Australia
Manipal University Jaipur, Jaipur, India
Department of Computer Science and Engineering, Malviya National Institute of Technology, Jaipur, India
Presents recent research in the field of data engineering
Discusses the outcomes of SSIC 2021, held in Manipal University Jaipur, India
Serves as a reference guide for researchers and practitioners in academia and industry
Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 238)
Table of contents (63 papers)
Front Matter
Using Machine Learning, Image Processing and Neural Networks to Sense Bullying in K-12 Schools: Enhanced
- Lalit Kumar, Palash Goyal, Karan Malik, Rishav Kumar, Dhruv Shrivastav
Feature-Based Comparative Study of Machine Learning Algorithms for Credibility Analysis of Online Social Media Content
- Utkarsh Sharma, Shishir Kumar
Smart Support System for Navigation of Visually Challenged Person Using IOT
- Tuhin Utsab Paul, Aninda Ghosh
Identity-Based Video Summarization
- Soummya Kulkarni, Darshana Bhagit, Masooda Modak
Security Testing for Blockchain Enabled IoT System
- A. B. Yugakiruthika, A. Malini
Two-Dimensional Software Reliability Model with Considering the Uncertainty in Operating Environment and Predictive Analysis
- Ramgopal Dhaka, Bhoopendra Pachauri, Anamika Jain
Object Recognition in a Cluttered Scene
- Rashmee Shrestha, Mandeep Kaur, Nitin Rakesh, Parma Nand
Breast Cancer Prediction on BreakHis Dataset Using Deep CNN and Transfer Learning Model
- Pinky Agarwal, Anju Yadav, Pratistha Mathur
A Comprehensive Tool Survey for Blockchain to IoT Applications
Hybrid Ensemble for Fake News Detection: An Attempt
- Lovedeep Singh
Detection of Abnormal Activity at College Entrance Through Video Surveillance
- Lalit Damahe, Saurabh Diwe, Shailesh Kamble, Sandeep Kakde, Praful Barekar
Audio Peripheral Volume Automation Based on the Surrounding Environment and Individual Human Listening Traits
- Adit Doshi, Helly Patel, Rikin Patel, Brijesh Satasiya, Muskan Kapadia, Nirali Nanavati
A Survey: Accretion in Linguistic Classification of Indian Languages
- Dipjayaben Patel
The Positive Electronic Word of Mouth: A Research Based on the Relational Mediator Meta-Analytic Framework in Electronic Marketplace
- Bui Thanh Khoa
A Review: Web Content Mining Techniques
- Priyanka Shah, Hardik B. Pandit
GPS-Free Localization in Vehicular Networks Using Directional Antennas
- Parveen, Sushil Kumar, Rishipal Singh
An Improved Scheme in AODV Routing Protocol for Enhancement of QoS in MANET
- Amit Kumar Bairwa, Sandeep Joshi
Knowledge Management Framework for Sustainability and Resilience in Next-Gen e-Governance
- Iqbal Hasan, Sam Rizvi
Enriching WordNet with Subject Specific Out of Vocabulary Terms Using Existing Ontology
- Kanika, Shampa Chakraverty, Pinaki Chakraborty, Aditya Aggarwal, Manan Madan, Gaurav Gupta
- SSIC Proceedings
- Data Engineering
- Data Analysis
- Data Innovation and Management
- Data Network
Book Title: Data Engineering for Smart Systems
Book Subtitle: Proceedings of SSIC 2021
Editors: Priyadarsi Nanda, Vivek Kumar Verma, Sumit Srivastava, Rohit Kumar Gupta, Arka Prokash Mazumdar
Series Title: Lecture Notes in Networks and Systems
DOI: https://doi.org/10.1007/978-981-16-2641-8
Publisher: Springer Singapore
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
Copyright Information: The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
Softcover ISBN: 978-981-16-2640-1 Published: 14 November 2021
eBook ISBN: 978-981-16-2641-8 Published: 13 November 2021
Series ISSN: 2367-3370
Series E-ISSN: 2367-3389
Edition Number: 1
Number of Pages: XXII, 681
Number of Illustrations: 79 b/w illustrations, 197 illustrations in colour
Topics: Data Engineering, Systems and Data Security, Data Structures and Information Theory, Artificial Intelligence, Big Data
Software Engineering for Data Analytics
Electrical Engineering and Systems Science > Signal Processing
Title: Guiding Masked Representation Learning to Capture Spatio-Temporal Relationship of Electrocardiogram
Abstract: Electrocardiograms (ECG) are widely employed as a diagnostic tool for monitoring electrical signals originating from a heart. Recent machine learning research efforts have focused on the application of screening various diseases using ECG signals. However, adapting to the application of screening disease is challenging in that labeled ECG data are limited. Achieving general representation through self-supervised learning (SSL) is a well-known approach to overcome the scarcity of labeled data; however, a naive application of SSL to ECG data, without considering the spatial-temporal relationships inherent in ECG signals, may yield suboptimal results. In this paper, we introduce ST-MEM (Spatio-Temporal Masked Electrocardiogram Modeling), designed to learn spatio-temporal features by reconstructing masked 12-lead ECG data. ST-MEM outperforms other SSL baseline methods in various experimental settings for arrhythmia classification tasks. Moreover, we demonstrate that ST-MEM is adaptable to various lead combinations. Through quantitative and qualitative analysis, we show a spatio-temporal relationship within ECG data.
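As a rough illustration of what reconstructing masked 12-lead ECG data involves on the input side, the sketch below hides random fixed-length patches in each lead and records which patches were hidden. It is not the ST-MEM implementation: the function name, the patch length, the 75% mask ratio, and zero-filling of hidden patches are all assumptions chosen for the example, and the encoder and decoder that would reconstruct the hidden samples are omitted.

```python
import random


def mask_ecg_patches(ecg, patch_len=50, mask_ratio=0.75, seed=0):
    """Randomly hide patches in each lead of a multi-lead ECG.

    ecg: list of leads, each a list of float samples (hypothetical layout).
    Returns (masked, mask), where masked has hidden patches set to 0.0 and
    mask[lead][patch] is True for every hidden patch.
    """
    rng = random.Random(seed)
    masked, mask = [], []
    for lead in ecg:
        n_patches = len(lead) // patch_len
        n_hidden = int(n_patches * mask_ratio)
        # Each lead gets its own independent set of hidden patches.
        hidden = set(rng.sample(range(n_patches), n_hidden))
        out = list(lead)
        for p in hidden:
            start = p * patch_len
            out[start:start + patch_len] = [0.0] * patch_len
        masked.append(out)
        mask.append([p in hidden for p in range(n_patches)])
    return masked, mask


# Hypothetical input: 12 leads, 500 samples each, all ones for easy checking.
ecg = [[1.0] * 500 for _ in range(12)]
masked, mask = mask_ecg_patches(ecg)
```

In a masked-reconstruction setup of this kind, a model would typically receive the visible patches and be trained to predict the hidden ones, with the loss computed only where the mask is True.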