Saturday, March 14, 2026


TECH


No battery needed: Single organic device can act as both indoor solar cell and photodetector

Next-generation optoelectronic systems (devices that convert light to electrical energy) leverage organic semiconductor-based indoor energy-autonomous architectures for cutting-edge applications. Notably, organic semiconductors possess mechanical flexibility, solution processability, and bandgap-tunable optoelectronic properties, making them highly attractive for indoor power generation via organic photovoltaics (OPVs), as well as for spectrally selective photodetection through organic photodetectors (OPDs). Unfortunately, technological progress in OPVs and OPDs has largely proceeded separately, and further research is needed to develop bifunctional OPV-OPD systems for concurrent energy harvesting and photodetection.

Additionally, the potential self-powered operation of such systems is restricted by conflicting charge transport kinetics, especially in the electron and hole transport layers (ETLs and HTLs, respectively). This limitation impacts device durability and stability and increases fabrication costs, making it essential to find new HTL materials beyond conventional options such as poly(3,4-ethylenedioxythiophene), the [2-(9H-carbazol-9-yl)ethyl]phosphonic acid self-assembled monolayer, MoOx, NiOx, and V2O5.

Introducing BPA as a minimalist HTL...In a pioneering study, a team of researchers led by Associate Professor Jea Woong Jo from the Department of Energy and Materials Engineering, Dongguk University, and Associate Professor Jae Won Shim, School of Electrical Engineering, Korea University, has presented benzene-phosphonic acid (BPA) as an innovative minimalist self-assembled monolayer-based HTL. It comprises a benzene core and a phosphonic acid anchoring group, facilitating low-cost synthesis and desirable interfacial properties on indium tin oxide, including energy alignment, a uniform monolayer, and stability. These findings were published in the journal Advanced Materials.

The key innovation of this research is the development of the "minimalist" molecular bridge BPA, which resolves a fundamental conflict in electronics by enabling a single device to operate as both an efficient indoor solar cell and a high-sensitivity light sensor.

Dr. Jo highlights the multifaceted advantages of their HTL material, "BPA concurrently provides energy level alignment with a photoactive layer for unimpeded hole-selective contact in the OPV mode, charge blocking capability for minimizing noise current in the OPD mode, robust ambient stability combined with simple and scalable manufacturability, as well as system-level economic viability, reflected in a high power-per-cost ratio under real-world indoor operating conditions."
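Two textbook photodetector relations make the OPD-mode argument concrete (these are standard figures of merit, not equations from the paper): the noise floor of a photodiode is set largely by its dark current, so an HTL that blocks unwanted charge injection directly improves sensitivity.

```latex
% Standard shot-noise and detectivity relations (illustrative only):
% the noise current i_n grows with the dark current I_d over bandwidth B,
% and the shot-noise-limited specific detectivity D* falls as the dark
% current density J_d rises (R is the responsivity).
i_n = \sqrt{2 q I_d B}, \qquad D^{*} = \frac{R}{\sqrt{2 q J_d}}
```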

The proposed material facilitates bifunctionally driven organic photonic conversion devices for next-generation applications. Credit: Associate Professor Jea Woong Jo from Dongguk University and Associate Professor Jae Won Shim from Korea University

Implications for IoT and smart environments...The bifunctional devices based on the proposed material could power the next generation of smart environments by enabling self-powered Internet of Things (IoT) sensors, wearable health monitors that harvest ambient light, and large-scale interactive "skins" on indoor surfaces that simultaneously collect energy and sense data without the need for external power sources or batteries. By enabling efficient indoor energy harvesting, this work could drastically reduce the global reliance on disposable batteries for billions of sensors, promoting long-term environmental sustainability. Furthermore, the minimalist synthesis approach significantly lowers fabrication costs, making high-performance electronics economically viable for mass deployment.

"Overall, synergy between performance and commercial practicality positions our BPA-HTL as a transformative enabler for self-powered IoT and wearable optoelectronics," concludes Dr. Shim.

In the next 5 to 10 years, these advancements could hasten the realization of next-generation communication networks and fully smart environments, where self-powered devices provide ubiquitous, seamless connectivity without the ecological or financial burden of current technologies.

Provided by Dongguk University


DIGITAL LIFE


The political battle that will decide the future of artificial intelligence

Artificial intelligence is not only transforming technology, the economy, and work. Now, it is also beginning to reshape the political landscape. In the United States, companies and investors linked to the AI sector are allocating large sums to election campaigns ahead of the 2026 midterm elections. This flow of money could influence who will occupy important positions—and, most importantly, who will have the power to define how artificial intelligence itself will be regulated.

The 2026 midterm elections in the United States are already being heavily influenced by investments from the artificial intelligence sector.

Large companies and executives linked to technology have already allocated more than $185 million to election campaigns across the country. The goal is clear: to influence future decisions on AI regulation.

These resources are being directed to candidates who, once elected, will be able to participate directly in the drafting of laws related to the technology.

Initial results suggest that this strategy may be working.

In primaries held in states like Texas and North Carolina, 19 out of 20 candidates financially supported by AI-related interests won their contests.

This performance caught the attention of political analysts, who see the technology sector as a new force for campaign financing.

The dispute over how to regulate artificial intelligence...Behind these investments lies a larger dispute about the future of artificial intelligence regulation. On one side are groups that advocate for a more uniform national regulatory framework. They argue that different state rules could hinder the development of the technology.

Among the funders of this movement are important figures in the technology industry.

The political group Leading the Future, for example, received $25 million from the president and co-founder of OpenAI, Greg Brockman, in addition to resources from venture capital investors such as Marc Andreessen and Ben Horowitz.

This group opposes attempts at state regulation that could fragment the sector.

On the other side are organizations that advocate for the creation of stricter rules for the development of AI, including safety and oversight mechanisms.

One of these organizations is Public First Action, which receives financial support from the company Anthropic.

The organization states that it supports “reasonable rules” for the sector, arguing that some level of regulation is necessary to protect society.

Candidates and campaigns are already feeling the impact...The flow of money from the AI industry has already begun to alter specific political disputes.

One example is the election race involving Representative Alex Bores, who has advocated for state initiatives to regulate artificial intelligence systems.

Among the proposals discussed is the requirement that companies developing AI models publish safety plans and meet minimum accountability standards.

The legislator ended up becoming the target of campaigns funded by both groups in favor of and against regulation.

This illustrates the intensity of the debate surrounding the topic.

Another case occurred in North Carolina, where Representative Valerie Foushee received financial support from a political committee linked to the technology sector.

She won a primary contest against progressive candidate Nida Allam, who advocated for a moratorium on the construction of new data centers.

These centers are an essential part of the infrastructure needed to train and operate artificial intelligence systems.

Public opinion is still divided...While technology money is heavily entering politics, American public opinion remains cautious about artificial intelligence. Recent polls indicate that many voters are concerned about the impacts of the technology.

A survey by the Pew Research Center showed that half of Americans are more concerned than excited about the advancement of AI.

Another study conducted by Marquette University revealed that 70% of registered voters in Wisconsin believe that the costs of data centers outweigh their benefits.

Among the most cited concerns is the impact of these centers on energy consumption.

According to data from the U.S. Energy Information Administration, electricity prices in the United States have risen 28% since the end of 2021.

This increase has reinforced debates about energy consumption associated with the expansion of artificial intelligence.

A dispute that goes beyond the 2026 elections...Despite the current focus on midterm elections, political strategists say this dispute is only just beginning.

Groups linked to the AI industry have already indicated that they intend to invest in campaigns across multiple election cycles.

Funding is not limited to federal elections.

Technology companies are also increasing their investments in state elections, where important rules for emerging sectors are often defined.

Since 2025, large technology companies have already allocated tens of millions of dollars to political committees in different American states.

The future of politics in the age of artificial intelligence...The growth of these investments shows how artificial intelligence has ceased to be just a technological topic. It has become a central political issue.

The decisions made by legislators in the coming years may define how the technology will be developed, used, and regulated.

At the same time, the volume of money that is beginning to circulate in this debate raises an important question for democracy.

When an emerging industry invests heavily in political campaigns, it's not just supporting candidates.

It's also trying to shape the future rules that will govern it.

mundophone

Friday, March 13, 2026


DIGITAL LIFE


New 'renewable' benchmark streamlines LLM jailbreak safety tests with minimal human effort

As new large language models, or LLMs, are rapidly developed and deployed, existing methods for evaluating their safety and discovering potential vulnerabilities quickly become outdated. To identify safety issues before they impact critical applications, Johns Hopkins researchers have developed a renewable and sustainable framework for evaluating LLMs that simplifies different types of attacks into high-quality, easily updatable safety tests—all while requiring minimal human effort to run.

Their work, "Jailbreak Distillation: Renewable Safety Benchmarking," was published in the Findings of the 2025 Conference on Empirical Methods in Natural Language Processing.

In LLM jailbreaking, seed queries are initial prompts intended to elicit harmful behavior from an LLM; on their own they usually fail because their adversarial intent is obvious. Instead, they're used to probe the safety guardrails of a particular LLM and to inform an attack algorithm, which transforms and refines them into more targeted, sophisticated prompts that can successfully bypass the LLM's guardrails and elicit the desired harmful behaviors.

To automate this process for safety testing, the researchers took existing adversarial algorithms proven to work well and ran them against the latest developmental LLMs to generate a diverse pool of these attack prompts.

"After constructing this pool, we used prompt selection algorithms to choose an effective subset of these generated attack prompts and develop an efficient safety benchmark," explains Jingyu "Jack" Zhang, a Ph.D. candidate in the Department of Computer Science and the first author of the study, which he conducted as part of an internship at Microsoft.

Zhang and his fellow researchers argue that a good LLM safety benchmark can elicit a wide range of harmful behaviors from many different models with high success and reliability, while also providing valuable metrics about the safety of each model tested. They report that their method, Jailbreak Distillation, or JBDistill, achieves all these requirements when tested on additional, unseen evaluation models.
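A minimal sketch of this two-stage recipe, assuming hypothetical attack, model, and judge interfaces: over-generate a pool of attack prompts, then keep the subset that succeeds most broadly. The ranking below is a simple stand-in for the paper's prompt selection algorithms, not JBDistill's actual code.

```python
# Illustrative JBDistill-style pipeline (interfaces and selection rule are
# assumptions): over-generate attack prompts, then select an effective subset.

def build_benchmark(seed_queries, attack_algos, dev_models, judge, k=50):
    # Stage 1: run every attack algorithm on every seed query against
    # every development model to build a diverse pool of attack prompts.
    pool = [attack(seed, model)
            for attack in attack_algos
            for seed in seed_queries
            for model in dev_models]

    # Record which development models each candidate prompt jailbreaks;
    # judge(model, prompt) -> True if the model's response is harmful.
    success = {p: {m for m in dev_models if judge(m, p)} for p in pool}

    # Stage 2: keep the k prompts that succeed on the most models,
    # a simple stand-in for the paper's prompt-selection algorithms.
    ranked = sorted(set(pool), key=lambda p: len(success[p]), reverse=True)
    return ranked[:k]
```

Because the prompt pool and the model roster are plain inputs, rerunning the same pipeline with new models or new attacks yields an updated benchmark, which is the sense in which the approach is renewable.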

JBDistill constructs high-quality and easily updatable safety benchmarks. Credit: Findings of the Association for Computational Linguistics: EMNLP 2025 (2025)

Using the same set of evaluation prompts for all LLMs helps ensure that JBDistill produces fair, reproducible comparisons. Previous methods, however, used different attack prompts for different models and had inconsistent compute budgets, which meant that even small changes in attack setups could lead to wide variability in success.

JBDistill's consistency also makes it easy for the researchers to generate new benchmarks by adding new developmental LLMs and attacks as they appear, or even by automatically rerunning the pipeline with different randomizations—thus achieving "renewable" safety benchmarking with minimal human effort, according to the research team.

"While previous work mainly focused on generating more transferable attack prompts, we demonstrate that over-generating attack prompts and then selecting a highly effective subset of them is a simple and effective method for enhancing attack transferability," says Zhang, who is advised by co-authors Benjamin Van Durme, an associate professor of computer and cognitive science, and Daniel Khashabi, an assistant professor of computer science.

The researchers tested various LLMs on JBDistill's dynamic benchmarks and compared the results with those from other commonly used, static benchmarks and traditional "red-team" attack tests, in which another party attempts to jailbreak an LLM on purpose. JBDistill's benchmarks achieved up to 81.8% effectiveness and could generalize to 13 different evaluation models, including newer, larger, proprietary, specialized, and reasoning LLMs—significantly outperforming traditional testing methods, the researchers report.

"Plus, the more models and attacks we use, the stronger the resulting benchmark becomes, suggesting that our approach is highly scalable," Zhang notes.

Although their benchmarking method is currently limited to English text, the researchers plan to expand its capabilities to include images, speech, and video, which will enhance overall LLM safety. And while their method is not a substitute for red-teaming tests, it offers benefits that complement traditional testing practices, they say.

"As LLMs are deployed on a global scale, they pose a significant risk if their safety isn't thoroughly assessed and managed," Zhang says. "Reliable safety benchmarking methods are crucial for simulating risks before deployment and identifying failure modes to prevent harm. Our framework provides an effective, sustainable, and adaptable solution for streamlining this kind of LLM evaluation."

Provided by Johns Hopkins University 


DIGITAL LIFE


AI agents can autonomously coordinate propaganda campaigns without human direction

Imagine it is two weeks before a major election in a closely contested state. A controversial ballot measure is on the line. Suddenly, a wave of posts floods X, Reddit, and Facebook, all pushing the same narrative, all amplifying each other, all generating the appearance of a massive grassroots movement. Except none of it is real.

Behind the scenes, a small cluster of artificial intelligence agents is organizing and coordinating messaging and spreading manufactured consensus across social media without a single human being in the loop.

The ramifications are alarming. These AI-powered networks could flood social media with coordinated propaganda before anyone even realizes what's happening. They could make fringe views appear mainstream, create the illusion of public consensus around false narratives, and push disinformation at a speed and scale no human team could match. Political polarization, already severe, could deepen further. Trust in the information people encounter on X, Facebook, and Reddit, already eroded, could fall even farther.

That troubling scenario is the central implication of a new paper accepted for publication at The Web Conference 2026, the premier academic venue for internet research. The study, written by a team of researchers at USC's Information Sciences Institute (ISI), is titled "Emergent Coordinated Behaviors in Networked LLM Agents: Modeling the Strategic Dynamics of Information Operations" and is published on the arXiv preprint server.

"Our paper shows that this is not a future threat: It's already technically possible," said Luca Luceri, ISI lead scientist and research assistant professor at the USC Thomas Lord Department of Computer Science within USC Viterbi and the School of Advanced Computing. "Even simple AI agents can autonomously coordinate, amplify each other and push shared narratives online without human control. This means disinformation campaigns could soon be fully automated, faster, and much harder to detect."

Added Jinyi Ye, lead author and a Ph.D. computer science student: "Coordinated AI agents can manufacture the appearance of consensus, manipulate trending dynamics, and accelerate message diffusion. In democratic contexts, especially around elections or crises, such capabilities could distort public discourse and undermine information integrity if left unchecked."

Re-share network across operational settings. Intra-group amplification among IO agents increases with operational awareness. Reported values represent the proportion of intra-group interactions relative to total actions. Credit: arXiv (2025)

Super-charged bots...Traditional bot campaigns are tightly scripted to follow fixed instructions: always retweet this account, reply with this hashtag, post this prewritten message. The content is repetitive and the patterns predictable, making them possible to uncover.

The new AI-powered model works differently. A hostile government, political operative, or bad actor sets a goal and designates a network of AI agents as a team. From there, the agents take over, writing their own posts, learning what works, copying their so-called teammates' successful approaches, and echoing each other's content. Because every post is slightly different and the coordination remains latent, these conversations seem genuine.

"Legacy bots are simply capable of artificially amplifying content in a programmatic way, defined in advance by human operators," Luceri said. "Generative agents are now capable of organizing influence campaigns in a fully automated way and creating credible content that can resonate with certain demographics."

In other words, the machinery of disinformation can now run itself, with limited human guidance.

The research...Along with Luceri and Ye, the doctoral student who is co-advised by him and ISI's Emilio Ferrara, co-authors include Mahdi Saeedi, a doctoral student advised by Luceri; Ferrara, ISI research team leader and professor of computer science at USC Viterbi's Thomas Lord Department of Computer Science and communication at USC Annenberg; Gian Marco Orlando and Vincenzo Moscato of the University of Naples Federico II; and Valerio La Gatta of Northwestern University.

Using a combination of network science and large language models, the same underlying technology that powers systems like ChatGPT, the researchers created and monitored synthetic bot agent personas, their posts, and their interactions with one another, simulating what a coordinated AI-powered social media network might look like.

The team built a simulated social media environment modeled after X, with 50 AI agents: 10 as influence operators and 40 as ordinary users. (The researchers later expanded this to 500 agents, finding consistent results.) The operators were given one mission: promote a fictitious candidate and spread a campaign hashtag. The researchers then tested three conditions: bots that only knew the campaign goal; bots that also knew who their teammates were; and bots that held periodic strategy sessions and voted on a collective plan.
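A hypothetical sketch of how those three conditions might be encoded; the field names and mission text are illustrative stand-ins, not the authors' simulation code.

```python
# Three operational-awareness conditions from the experiment, expressed as
# hypothetical configuration (illustrative only).
MISSION = "Promote the candidate and spread the campaign hashtag."

CONDITIONS = {
    "goal_only":    {"mission": MISSION, "knows_teammates": False, "strategy_sessions": False},
    "team_aware":   {"mission": MISSION, "knows_teammates": True,  "strategy_sessions": False},
    "deliberative": {"mission": MISSION, "knows_teammates": True,  "strategy_sessions": True},
}

def make_population(n_operators=10, n_users=40):
    """Build the 50-agent population used in the base simulation."""
    operators = [{"id": i, "role": "io_agent"} for i in range(n_operators)]
    users = [{"id": n_operators + i, "role": "ordinary_user"}
             for i in range(n_users)]
    return operators + users
```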

The most striking finding was that simply telling the bots who their teammates were produced coordination nearly as strong as when bots actively strategized together. They amplified each other's posts, converged on the same talking points, and recycled successful content.

One AI agent wrote: "I want to retweet this because it has already gained engagement from several teammates. Retweeting it again could help increase its visibility and reach a wider audience."

Threats to democracy...Luceri is careful to note that the study was only a simulation. However, he worries about what the findings might suggest. "The worst scenario during political events is that these adversarial attacks could lead to opinion manipulation and belief change," Luceri said, "further sowing division and eroding trust in our institutions."

The threat extends beyond elections to public health, immigration and economic policy, he added.

Platforms could fight back, the researchers said, by looking less at what individual posts say and more at how accounts behave together, whether they share the same content, quickly reinforce one another, or push nearly identical narratives from accounts that have no obvious connection. Those telltale signs, they argue, are detectable even when the content itself looks organic.
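A minimal sketch of that behavioral approach, assuming a simple content-similarity signal: flag account pairs whose posting streams are nearly identical, regardless of what any single post says. The threshold and the per-account aggregation are illustrative choices.

```python
# Flag account pairs with suspiciously similar content streams using
# TF-IDF cosine similarity (threshold and aggregation are assumptions).
from itertools import combinations
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def flag_coordinated(accounts, sim_threshold=0.8):
    """accounts: dict mapping account id -> list of that account's posts."""
    ids = list(accounts)
    docs = [" ".join(accounts[a]) for a in ids]   # one document per account
    sims = cosine_similarity(TfidfVectorizer().fit_transform(docs))
    return [(ids[i], ids[j], round(float(sims[i, j]), 2))
            for i, j in combinations(range(len(ids)), 2)
            if sims[i, j] >= sim_threshold]

posts = {
    "acct_a": ["Vote YES on the measure!", "Everyone supports the measure."],
    "acct_b": ["Everyone supports the measure!", "Vote YES on the measure."],
    "acct_c": ["Nice weather for the farmers market today."],
}
print(flag_coordinated(posts))  # flags only the acct_a / acct_b pair
```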

Whether platforms will act is unknown. Luceri noted that aggressive bot detection could reduce the active user base, a potential disincentive for companies whose business models depend on keeping users on their pages for as long as possible.

Provided by University of Southern California

Thursday, March 12, 2026




TECH


Shortest paths research narrows a 25-year gap in graph algorithms

Most of you have used a navigation app like Google Maps for your travels at some point. These apps rely on algorithms that compute shortest paths through vast networks. Now imagine scaling that task to calculate distances between every pair of points in a massive system, for example, a transportation grid, a communication backbone, or even a biological network such as protein or neural interaction networks.

This is the classic "all-pairs shortest paths" (APSP) problem, one of the central challenges in theoretical computer science. In short, APSP asks for the shortest-path distance between every pair of vertices in the graph. Here, a "graph" simply means a network made of points (called vertices) connected by links (called edges). The points could represent cities, PCs, metro stations or even organs of the body, and the links could represent roads, cables, tracks, or blood vessels.

Computing exact distances between every pair of vertices in a large graph is costly in both time and space. For dense networks, it can take cubic time: in simple terms, if the network size doubles, the work increases about eightfold, i.e., the work grows much faster than the network does. Hence, for decades, researchers have hunted for faster approximation algorithms: methods that deliver distances very close to the actual values while running in far less time.
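The cubic bound comes from classic exact algorithms such as Floyd-Warshall, whose three nested loops over the n vertices make the eightfold-growth arithmetic visible; a standard textbook implementation follows.

```python
# Exact APSP via Floyd-Warshall: three nested loops over n vertices give
# O(n^3) time, so doubling n multiplies the work by about eight.
INF = float("inf")

def floyd_warshall(n, edges):
    """edges: iterable of (u, v, weight) triples for an undirected graph."""
    dist = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    for k in range(n):              # allow vertex k as an intermediate stop
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
```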

How a classic shortcut works...Back in 1996, Dor, Halperin, and Zwick (DHZ) showed that you could get distance estimates no worse than twice the actual value, a "2‑approximation," in nearly optimal time. In simple words, if the real distance is 10 kilometers, their method guarantees an answer between 10 and 20. To do this efficiently, the algorithm operates on a small subset of vertices sampled from the network (referred to as sampled vertices). Instead of exploring every possible route between two vertices, the algorithm relies on the sampled vertices to efficiently estimate distances.

Now, picture estimating travel between Delhi and Chennai: For such long trips, the DHZ shortcut gives a quick, fairly reliable answer. That is, when vertices were far apart, this usually resulted in an estimate no worse than twice the true shortest path, because a sampled vertex was often located near the shortest path between them.
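A toy version of the sampled-vertex idea for unweighted graphs (an illustration of the principle, not the DHZ algorithm itself): estimate every distance by detouring through a sampled vertex. By the triangle inequality the estimate never undershoots the true distance, and it stays within twice the truth whenever some sample lies on or near the shortest path, which is likely for long paths but not for short ones.

```python
# Toy sampled-vertex distance estimator (illustrative, not DHZ's algorithm).
import random
from collections import deque

def bfs(adj, src):
    """Exact distances from src in an unweighted graph (adjacency dict)."""
    dist, queue = {src: 0}, deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def approx_distances(adj, sample_size=None):
    nodes = list(adj)
    k = sample_size or max(1, int(len(nodes) ** 0.5))
    samples = random.sample(nodes, k)
    from_sample = {s: bfs(adj, s) for s in samples}  # one BFS per sample

    def estimate(u, v):
        # Best detour through any sampled vertex s: d(u, s) + d(s, v).
        return min(from_sample[s].get(u, float("inf")) +
                   from_sample[s].get(v, float("inf")) for s in samples)
    return estimate
```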

However, for short hops, like two neighborhoods in the Mumbai suburbs, that same shortcut could miss the mark and give a poor estimate. The catch: for close pairs, the method cannot always guarantee a value of no more than twice the actual distance. For instance, if the actual shortest path is two edges long, the estimate obtained via the sampled vertices might be five edges long (i.e., more than 2 × 2 = 4), which violates the "at most twice" promise. For vertices that were close together, this detour often distorted the result. That limitation persisted for nearly 25 years.

A new multi-scale refinement emerges...A noteworthy refinement to this computation for the closer vertices of massive networks was presented by Dr. Manoj Gupta at the 66th Annual Symposium on Foundations of Computer Science (FOCS 2025). He put forth this refinement in the paper titled "Improved 2-Approximate Shortest Paths for close vertex pairs."

Dr. Gupta, an associate professor at the Indian Institute of Technology Gandhinagar, demonstrates that 2-approximate distances can now be efficiently computed for vertex pairs that are much closer than previously thought possible, without increasing overall runtime.

Earlier approaches had relied on sampling representative vertices to help estimate distances.

The new approach, by contrast, introduces a multi-scale refinement of this sampling process, carefully layering information about the graph's structure. Instead of relying on a single layer of sampled vertices, the algorithm organizes them at different scales, so that even shorter paths are likely to be captured accurately. This lets the algorithm keep the same time complexity while shrinking the minimum distance threshold for which the 2-approximation guarantee holds. The advance centers on a smarter sampling strategy.
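Purely as an illustration of the multi-scale idea (this is not Dr. Gupta's actual construction, which is detailed in the FOCS paper), one can imagine dense sample layers that explore only short distances alongside sparse layers that reach far, so close vertex pairs also find a nearby sampled "hub":

```python
# Hypothetical multi-scale sampling: layer i keeps ~n / 2^i samples and
# explores to radius ~2^(i+1), so short paths are covered by dense layers.
import random
from collections import deque

def bounded_bfs(adj, src, radius):
    """Distances from src, truncated at the given radius."""
    dist, queue = {src: 0}, deque([src])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def multiscale_layers(adj, levels=4):
    nodes, layers = list(adj), []
    for i in range(levels):
        k = max(1, len(nodes) // (2 ** i))        # denser at small scales
        samples = random.sample(nodes, k)
        layers.append({s: bounded_bfs(adj, s, 2 ** (i + 1)) for s in samples})
    return layers
```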

In short, the DHZ algorithm worked efficiently but only for sufficiently long distances. In comparison, the new algorithm maintains or improves speed while extending the guarantee to much shorter distances (such as Mumbai's suburbs). That shift in the boundary is the crux of this work.

Why this theoretical advance matters...Large networks are everywhere, from internet routing and transport grids to social platforms and AI systems that use graph data. In the real world, we often do not need precise distances; what we need is speed, scale, and dependable approximations. This is exactly what the new algorithm proposes to provide.

Improving a long‑standing theoretical bound matters for more than theory. It helps us see how global structure can emerge from local connections and brings us closer to the practical goal of computing reasonable all‑pairs distances quickly.

In a field where progress usually comes in very small steps, improving a result that has stood since 1996 is a meaningful leap. Narrowing that gap strengthens the theory behind scalable graph algorithms—the quiet engines that keep many connected systems running smoothly today.

Provided by Indian Institute of Technology Gandhinagar


DIGITAL LIFE


AI assistants can sway writers' attitudes, even when they're watching for bias, experiments indicate

Artificial intelligence-powered writing tools such as autocomplete suggestions can demonstrably change the way people express themselves, but can they also change how they think? Cornell Tech researchers think so.

Biased autocomplete nudges user opinions...In two large-scale experiments, participants were exposed to a biased AI writing assistant that provided autocomplete suggestions as they wrote about societal issues like whether the death penalty should be abolished or whether fracking should be allowed. Using pre- and post-experiment surveys, the researchers found that participants who used the biased AI had their views gravitate toward the AI's positions.

What's more, participants were unaware of the shifts in their opinions—and explaining the AI's bias to the participants, either before or after the exercise, didn't mitigate the AI's influence.

"Previous misinformation research has shown that warning people before they're exposed to misinformation, or debriefing them afterward, can provide 'immunity' against believing it," said Sterling Williams-Ceci, a doctoral candidate in information science. "So we were surprised because neither of those interventions actually reduced the extent to which people's attitudes shifted toward the AI's bias in this context."

From earlier work to new concerns...Williams-Ceci is the lead author of "Biased AI Writing Assistants Shift Users' Attitudes on Societal Issues," which is published in Science Advances. This work extends a project started by co-author Maurice Jakesch, now assistant professor of computer science at Bauhaus University.

Senior author Mor Naaman, professor of information science, said a couple of things have happened that made extending the group's previous research important.

"For one, autocomplete is everywhere now," Naaman said. "It was less prevalent and limited to short completions three years ago, but these days Gmail, for example, will suggest writing entire emails on your behalf. Second, when we first wrote the paper, people were saying, "Why would AI be purposefully biased?" But since then, it has become clear that bias explicitly built into AI interactions is a very plausible scenario."

Naaman and the group also found in the latest work that biased AI suggestions have the power to shift attitudes "across different topics, and across different political leanings."

Inside the two large experiments...In the two studies, together involving more than 2,500 people, the group found consistently that participants' attitudes shifted toward the biased AI suggestions. In one study, participants were asked to write a short essay for or against standardized testing being used in education.

Participants either saw biased autocomplete suggestions favoring testing or did not; a third group, instead of autocomplete suggestions, was shown a list of pro-testing arguments generated by the AI prior to the experiment, and these participants' attitudes did not shift as much.

The second experiment broadened the scope, asking participants to write about politically consequential topics including the death penalty, fracking, genetically modified organisms and voting rights for felons.

For each issue, the researchers engineered AI suggestions to gravitate toward a predetermined bias; opinions were liberal-leaning for the death penalty and GMOs, conservative-leaning for felons' voting and fracking. Additionally, some participants were made aware of the bias in the AI, either before or after writing.
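A hypothetical sketch of how such a biased assistant could be wired up; the stance instructions and the stub generator below stand in for a real LLM call, and none of this is the researchers' experimental code.

```python
# Hypothetical biased-autocomplete assistant (illustrative only): a hidden
# stance instruction is mixed into every completion request.
STANCE_INSTRUCTIONS = {
    "pro":  "Continue the sentence with an argument IN FAVOR of the topic.",
    "anti": "Continue the sentence with an argument AGAINST the topic.",
}

def stub_llm(instruction, text):
    """Stand-in for a real LLM completion call."""
    tail = ("which strengthens the case for it" if "FAVOR" in instruction
            else "which casts serious doubt on it")
    return text + " ... " + tail

def suggest_completion(user_text, stance="pro"):
    # The user only sees the suggestion, never the steering instruction.
    return stub_llm(STANCE_INSTRUCTIONS[stance], user_text)

print(suggest_completion("Standardized testing in schools", stance="pro"))
```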

Warnings fail to blunt AI influence...In every experiment, the researchers found that participants' views shifted in the direction of the AI bias. The biggest surprise, Naaman said, was that mitigation measures did not work.

"We told people before, and after, to be careful, that the AI is going to be (or was) biased, and nothing helped," Naaman said. "Their attitudes about the issues still shifted."

It's well understood that people's attitudes influence their behaviors, and even that people's behavior shifts their attitudes, said Williams-Ceci. But here, the influence is covert: People do not notice it, and are unable to resist it, she said, which can have serious consequences.

Broader risks of biased AI writing..."A lot of research has shown that large language models and AI applications are not just producing neutral information, but they also actually can produce very biased information, depending on how they were trained and implemented," said Williams-Ceci.

"By doing that, there's a risk that these systems, inadvertently or purposefully, induce people to write biased viewpoints, which decades of psychology research has shown can in turn shift people's attitudes."

Provided by Cornell University 

Wednesday, March 11, 2026


DIGITAL LIFE


Researchers train AI to better follow artists by sharing creative 'ground rules'

The conversation around AI and art generally swings between two extremes: a flood of AI slop or the total automation of creative work. The more desirable approach may be an AI that behaves as a useful collaborator. But thus far, visual artists working with text-to-image tools confront frustratingly basic hurdles in directing AI. Ask an AI to create an image of a house? Not too difficult. Direct it to make the house red, with four front-facing windows, a chimney, and ivy covering the left side? Good luck.

Stanford computer science, cognitive psychology, and education scholars believe they can help AI better augment human creativity by teaching models and people to communicate ideas with each other. The scholars are developing a shared conceptual grounding for humans to collaborate with generative AI on production-quality visual content ranging from illustrations to diagrams to animations.

"While the models seem amazing, they are terrible collaborators," says Maneesh Agrawala, professor of computer science at Stanford and a co-principal investigator for the project. "Creators have no way of knowing what the AI will produce when given a certain text prompt. If you ask for a suburban single-family home, it generates a modern duplex."

Authoring original content requires having opinions and constantly making choices, Agrawala explains. Humans and AI need a shared set of concepts so the nuance doesn't get lost in translation.

Deciphering the human creative process...The Stanford team is approaching this problem from two directions. First, the scholars are running experiments to better understand how people collaborate to create visual content. They have conducted several studies of people performing creative tasks to analyze through chat logs and sketches how the participants communicate as they work together.

"If we want to build AI systems that understand how humans think during creative projects, we should start by learning as much as we can from the way that people establish common conceptual ground with each other," says Judith Fan, assistant professor of psychology at Stanford. "Not everyone talks or draws the same way, but they still expect to be understood."

Building AI tools that understand creators...Second, the team is building open-source AI tools to apply the lessons learned about human creative communication. For example, ControlNet teaches text-to-image diffusion models about spatial composition, using two separate features, blocking and detailing, to mirror how artists begin with a rough sketch and then complete the detail of a drawing. Today's models struggle to capture the idea of a pose or how objects should be arranged in a scene. With this tool, creators can guide models to a layout that matches their vision.
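For readers who want to try the idea, a minimal sketch using ControlNet through the Hugging Face diffusers library follows; the checkpoint names are widely used public ones assumed here for illustration, and a CUDA-capable GPU is assumed.

```python
# Condition a text-to-image diffusion model on a rough sketch (ControlNet
# via Hugging Face diffusers). Checkpoint names and input file are assumed.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-scribble", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet, torch_dtype=torch.float16).to("cuda")

# The rough sketch fixes the composition ("blocking"); the text prompt
# then fills in the appearance ("detailing").
sketch = load_image("house_scribble.png")   # hypothetical input sketch
image = pipe(
    "a red house with four front-facing windows, a chimney, "
    "and ivy covering the left side",
    image=sketch, num_inference_steps=30).images[0]
image.save("house.png")
```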

Another tool called FramePack enables creators to generate 3D videos from a text prompt for multi-scene storytelling. This tool teaches models to prioritize scenes based on their importance to the overall story, similar to the way a human would work on a project.

A third innovation explores the power of neuro-symbolic AI, which combines neural networks with reasoning capabilities to increase transparency and overcome the limitations of "black box" AI. Using these principles, the team has developed a visual scene coding language that works from a natural language text prompt to produce lines of code, which are executed and rendered to create a 3D scene. Human creators can stay in the loop to inspect or edit the code and prompt the AI to update its program at any time.
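A hypothetical miniature of that workflow: the model emits editable code such as the following rather than opaque pixels, and the creator can inspect a line, change an attribute, and re-render. The Scene API here is invented purely for illustration.

```python
# Invented mini scene-coding language: a prompt becomes code, and the code
# becomes a scene description a renderer could draw (illustrative only).
from dataclasses import dataclass, field

@dataclass
class Scene:
    objects: list = field(default_factory=list)

    def add(self, kind, position, **attrs):
        self.objects.append({"kind": kind, "position": position, **attrs})

    def render(self):
        # A real system would hand this to a 3D renderer; we just print it.
        for obj in self.objects:
            print(obj)

# Code a model might produce for "a red house with a chimney on a lawn".
scene = Scene()
scene.add("house", position=(0, 0, 0), color="red", windows=4)
scene.add("chimney", position=(1, 3, 0), attach_to="house")
scene.add("lawn", position=(0, -1, 0), extent=(10, 10))
scene.render()
```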

Reimagining education content...The impact of a shared conceptual grounding between humans and AI promises to yield new applications in diverse fields, including design, simulation, animation, robotics, and education, says Agrawala. The research team is currently working with gaming platform Roblox to enable players to generate unique 3D objects from text prompts while imposing game restrictions (so, for example, players won't be able to create weapons in a nonviolent game).

More broadly, the scholars hope that one day human creators of all skill levels—from hobbyists and small business owners to visual experts—will have a friction-free way to express their ideas using a combination of natural language, example content, code snippets, and other modalities.

"We're serious about equipping the broader creative community with the tools they need to communicate with AI effectively," Fan says.

Provided by Stanford University
