Snowflake report unearths Python as the programming language of choice for AI development, while the processing of unstructured data has increased by 123 percent in the past year
Dubai, UAE, 20th March, 2024 – Large language models (LLMs) are increasingly being used to create chatbots, according to Data Cloud company Snowflake. As generative AI continues to revolutionize the industry, chatbots have grown from being approximately 18 percent of the total LLM apps available, to now encompassing 46 percent as of May 2023, and that share is still climbing. In addition, a survey of Streamlit’s developer community found that nearly 65 percent of respondents said their LLM projects were for work purposes, signaling a shift in the importance of harnessing generative AI to improve workforce productivity, efficiency, and insights.
These results are based on usage data from more than 9,000 Snowflake customers and are summarized in Snowflake’s new “Data Trends 2024” report. The report focuses on how global enterprise business and technology leaders are leveraging resources such as AI to build their data foundation and transform future business operations. The new data shows a shift from LLM applications with text-based input (2023: 82%, 2024: 54%) to chatbots with iterative text input, offering the ability to have a natural conversation.
“Conversational apps are on the rise, because that’s the way humans are programmed to interact. And now it is even easier to interact conversationally with an application,” explains Jennifer Belissent, Principal Data Strategist at Snowflake. “We expect to see this trend continue as it becomes easier to build and deploy conversational LLM applications, particularly knowing that the underlying data remains well governed and protected. With that peace of mind, these new interactive and highly versatile chatbots will meet both business needs and user expectations.”
Over 33,000 LLM Applications in Nine Months
The report also shows that 20,076 developers from Snowflake’s Streamlit community have built 33,143 LLM apps in the past nine months. When it comes to developing AI projects, Python is the programming language of choice due to its ease of use, active community of developers, and vast ecosystem of libraries and frameworks. In Snowpark, which enables developers to build apps quickly and cost-effectively, the use of Python grew significantly faster than that of Java and Scala in the past year: Python grew by 571 percent, Scala by 387 percent, and Java by 131 percent. With Python, developers can work faster, accelerating prototyping and experimentation, and therefore overall learning, as developer teams make early forays into cutting-edge AI projects.
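To put the Snowpark growth figures in perspective, the short Python sketch below converts each quoted percentage increase into a usage multiplier. It is an illustration only: it assumes that "grew by N percent" means an increase of N percent over the level of a year earlier, and the names `growth_percent` and `growth_multiplier` are hypothetical helpers, not part of the report or of any Snowflake API.

```python
# Growth figures quoted in the report for Snowpark language usage (past year).
growth_percent = {"Python": 571, "Scala": 387, "Java": 131}


def growth_multiplier(percent_increase: float) -> float:
    """Convert a percentage increase into an end-to-start usage ratio.

    Assumption: an increase of N percent means usage ended at
    (1 + N / 100) times its starting level.
    """
    return 1 + percent_increase / 100


# Multiplier per language, e.g. 100 percent growth would give 2.0.
multipliers = {lang: growth_multiplier(p) for lang, p in growth_percent.items()}

# Print the languages from fastest to slowest growth.
for lang, m in sorted(multipliers.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{lang}: {m:.2f}x the usage of a year earlier")
```

Under that assumption, Python usage in Snowpark ended the year at roughly 6.7 times its starting level, compared with about 4.9 times for Scala and 2.3 times for Java.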
In terms of where application development is taking place, the trend is towards programming LLM applications directly on the platform on which the data is also managed. This is indicated by a 311 percent increase in Snowflake Native Apps – which enable the development of apps directly on Snowflake’s platform – between July 2023 and January 2024. Developing applications on a single data platform eliminates the need to export data copies to third-party technologies, helping develop and deploy applications faster, while reducing operational maintenance costs.
Data Governance in Companies is Growing in Importance
With the adoption of AI, companies are increasing analysis and processing of their unstructured data. This is enabling companies to discover untapped data sources, making a modern approach to data governance more crucial than ever to protect sensitive and private data. The report found that enterprises have increased the processing of unstructured data by 123 percent in the past year. IDC estimates that up to 90 percent of the world’s data is unstructured video, images, and documents. Clean data gives language models a head start, so unlocking this untapped 90 percent opens up a number of business benefits.
“Data governance is not about locking down data, but ultimately about unlocking the value of data,” said Belissent. “We break governance into three pillars: knowing data, securing data and using data to deliver that value. Our customers are using new features to tag and classify data so that the appropriate access and usage policies can be applied. The use of all data governance functions has increased by 70 to 100 percent. As a result, the number of queries of protected objects has increased by 142 percent. When the data is protected, it can be used securely. That delivers peace of mind.”
“Taken individually, each of these trends is a single data point that shows how organizations across the globe are dealing with different challenges. When considered together, they tell a larger story about how CIOs, CTOs, and CDOs are modernizing their organizations, tackling AI experiments, and solving data problems — all necessary steps to take advantage of the opportunities presented by advanced AI,” says Belissent. “The important thing to understand is that the era of generative AI does not require a fundamental change in data strategy. It does, however, require accelerated execution of that strategy. It requires breaking down data silos even faster and opening up access to data sources, wherever they may be in the company or across a broader data ecosystem.”
Report Methodology
The Snowflake Data Trends Report 2024 is generated from fully aggregated, anonymized data detailing usage of the Snowflake Data Cloud and its integrated features and tools. In this report, we examine patterns and trends in data and AI adoption across more than 9,000 global Snowflake accounts. The Snowflake Data Cloud provides insight into the state of data and AI, including which technologies are the fastest growing. Note that usage attributable to internal consumption, if any, has been removed and is not reflected in any of the metrics contained herein. The accounts and usage reflected in this report represent every major industry and include both longtime Snowflake users and others who only recently joined the Data Cloud.
About Snowflake
Snowflake enables every organization to mobilize their data with Snowflake’s Data Cloud. Customers use the Data Cloud to unite siloed data, discover and securely share data, power data applications, and execute diverse AI/ML and analytic workloads. Wherever data or users live, Snowflake delivers a single data experience that spans multiple clouds and geographies. Thousands of customers across many industries, including 691 of the 2023 Forbes Global 2000 (G2K) as of January 31, 2024, use Snowflake Data Cloud to power their businesses. Learn more at snowflake.com.