The Unsexy Side of AI Jobs: Infrastructure, Cleaning and Portability

 

Image sourced from TechnologyAdvice.com

This week, I read this article in the WSJ:

https://www.wsj.com/tech/ai/ai-jobs-demand-tech-layoffs-5b7344c0?mod=Searchresults_pos9&page=1

The article states that AI jobs are in higher demand than other tech jobs, which continue to experience layoffs. This seems to reflect increased interest from firms in building deep learning algorithms and other AI software to drive decision-making and revenue. The article points out that firms are seeking computer scientists who can develop these deep learning applications. What I didn't see was any reference to firms also filling the support positions these applications depend on.

Deep learning relies on clean, accurate data from which the AI can learn, plus even more data to test the model and confirm that it works. So firms need big data not only to make ongoing decisions, but also to train and refine their AI. All of this requires massive amounts of accessible, clean, and accurate data.

The problem is that most firms don't have large, ready-to-use data sets available. Data may be stored inside particular applications, like Epic, Excel, or Power BI. These data sets may need to be combined, cleaned to remove erroneous entries, and managed so that the data is labeled and usable. The ongoing challenge is to keep data flowing continually from its original sources into these comprehensive data sets.
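For readers who want to see what "combine, clean, and label" can look like in practice, here is a minimal Python sketch. It is only an illustration under stated assumptions: the field names (`record_id`, `value`), the source names, and the sample rows are all hypothetical, and real exports from tools like Epic or Excel would need far more handling than this.

```python
import math
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Record:
    record_id: str  # hypothetical key shared by every source
    value: float    # hypothetical numeric measurement
    source: str     # label recording which system the row came from


def parse_row(row: dict, source: str) -> Optional[Record]:
    """Return a labeled Record, or None if the row is erroneous."""
    # Normalize the key so "a-01" and "A-01 " are treated as the same record.
    record_id = str(row.get("record_id") or "").strip().upper()
    try:
        value = float(row.get("value"))
    except (TypeError, ValueError):
        return None  # missing or non-numeric value
    if not record_id or not math.isfinite(value) or value < 0:
        return None  # blank key, NaN/infinity, or negative value
    return Record(record_id, value, source)


def combine(sources: Dict[str, Iterable[dict]]) -> List[Record]:
    """Merge rows from every source, dropping bad rows and duplicates."""
    seen = set()
    combined = []
    for source, rows in sources.items():
        for row in rows:
            record = parse_row(row, source)
            if record is None:
                continue
            key = (record.record_id, record.value)
            if key in seen:
                continue  # the same entry was already kept from an earlier row
            seen.add(key)
            combined.append(record)
    return combined


# Hypothetical sample rows standing in for two incompatible systems.
spreadsheet_rows = [
    {"record_id": "a-01", "value": "120.5"},
    {"record_id": "a-02", "value": "n/a"},      # erroneous entry
]
records_system_rows = [
    {"record_id": "A-01 ", "value": 120.5},     # duplicate of the first row
    {"record_id": "b-07", "value": 48.0},
]

clean = combine({
    "spreadsheet": spreadsheet_rows,
    "records_system": records_system_rows,
})
for record in clean:
    print(record)
```

Even in this toy version, most of the code is about deciding what counts as a bad or duplicate row. The hard part is making those decisions about the data, and they have to be made long before any model sees it.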

To better understand this process, I spoke to Seth Zehler, VP of Trading and Analytics at Invenergy, a large, privately held local energy firm that uses AI for operations decisions. He also happens to be my husband.

According to Zehler, 

"Our CEO asked his staff if we needed to hire a consultant to come in and tell us where we needed to implement AI. Myself and our chief of strategy both told him that we weren’t anywhere near ready for AI because our data was spread across multiple incompatible systems with no way to aggregate the data into AI models. We needed to clean our data house first before we could do any AI."

This data portability issue applies in healthcare as well. Healthcare systems hold huge amounts of potentially usable data, but accessing it efficiently is difficult: it typically requires specific data requests in Epic or a similar electronic health record system.

Zehler also noted, 

"Data engineers are the oilfield workers of the data economy. They mine the data and prepare it for the “sexy” job titles like “Data Scientist” or “Machine Learning Engineer”. Without the data engineer, AI has no fuel for its engine." 

In conclusion, firms may be getting ahead of themselves. To be sure, models and algorithms need to be built, but firms need to focus not just on the novel, sexy AI applications, but on the underlying infrastructure as well. This may require lower-level employees to capture, clean, and monitor data sets until the software can do that independently. In fact, firms may need to invest in the unsexy jobs first, to ensure that the highly paid AI experts can provide value commensurate with their compensation. As is often the case in business operations, the rank-and-file employees provide the critical foundation for efficient operations, and those at the top cannot be effective without them.
