Faster computing results without fear of errors
by Adam Zewe, MIT News Office
Boston MA (SPX) Jun 10, 2022
Researchers have pioneered a technique that can dramatically accelerate certain types of computer programs automatically, while ensuring program results remain accurate. Their system boosts the speeds of programs that run in the Unix shell, a ubiquitous programming environment created 50 years ago that is still widely used today. Their method parallelizes these programs, which means that it splits program components into pieces that can be run simultaneously on multiple computer processors. This enables programs to execute tasks like web indexing, natural language processing, or analyzing data in a fraction of their original runtime.
"There are so many people who use these types of programs, like data scientists, biologists, engineers, and economists. Now they can automatically accelerate their programs without fear that they will get incorrect results," says Nikos Vasilakis, research scientist in the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT.
The system also makes it easy for the programmers who develop tools that data scientists, biologists, engineers, and others use. They don't need to make any special adjustments to their program commands to enable this automatic, error-free parallelization, adds Vasilakis, who chairs a committee of researchers from around the world who have been working on this system for nearly two years.
Vasilakis is senior author of the group's latest research paper, which includes MIT co-author and CSAIL graduate student Tammam Mustafa and will be presented at the USENIX Symposium on Operating Systems Design and Implementation. Co-authors include lead author Konstantinos Kallas, a graduate student at the University of Pennsylvania; Jan Bielak, a student at Warsaw Staszic High School; Dimitris Karnikis, a software engineer at Aarno Labs; Thurston H.Y. Dang, a former MIT postdoc who is now a software engineer at Google; and Michael Greenberg, assistant professor of computer science at the Stevens Institute of Technology.
A decades-old problem
The Unix shell remains popular, in part, because it is the only programming environment that enables one script to be composed of functions written in multiple programming languages. Different programming languages are better suited for specific tasks or types of data; if a developer uses the right language, solving a problem can be much easier.
"People also enjoy developing in different programming languages, so composing all these components into a single program is something that happens very frequently," Vasilakis adds.
While the Unix shell enables multilanguage scripts, its flexible and dynamic structure makes these scripts difficult to parallelize using traditional methods. Parallelizing a program is usually tricky because some parts of the program are dependent on others. This determines the order in which components must run; get the order wrong and the program fails.
When a program is written in a single language, developers have explicit information about its features and the language that helps them determine which components can be parallelized. But those tools don't exist for scripts in the Unix shell. Users can't easily see what is happening inside the components or extract information that would aid in parallelization.
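To make the dependency problem concrete, here is a minimal Python sketch. It is an illustration only, not code from the researchers' system, and every function name in it is hypothetical. It splits the input into chunks, runs a component on each chunk concurrently, and concatenates the results in order. A component that treats each line independently gives the same answer either way; a component whose output depends on earlier lines, here a line numberer, does not.

```python
from concurrent.futures import ThreadPoolExecutor


def split_chunks(lines, n):
    """Split a list into at most n contiguous chunks, preserving order."""
    size = max(1, -(-len(lines) // n))  # ceiling division, at least 1
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def run_split(component, lines, workers=2):
    """Run `component` on each chunk concurrently, then concatenate in order."""
    chunks = split_chunks(lines, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(component, chunks))  # map preserves chunk order
    return [line for part in parts for line in part]


def to_upper(lines):
    # Stateless: each output line depends only on its own input line.
    return [line.upper() for line in lines]


def number_lines(lines):
    # Stateful: each output line depends on its position in the whole input.
    return [f"{i} {line}" for i, line in enumerate(lines, start=1)]


data = ["a", "b", "c", "d"]

# Splitting the stateless component is safe: same result as a sequential run.
same = run_split(to_upper, data) == to_upper(data)

# Splitting the stateful component silently changes the result,
# because numbering restarts at 1 in every chunk.
differs = run_split(number_lines, data) != number_lines(data)
```

The second case is the kind of silent error that makes naive parallelization risky: the program still runs, but its output is wrong.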
A just-in-time solution
This approach avoids another problem in shell programming - it is impossible to predict the behavior of a program ahead of time. By parallelizing program components "just in time," the system avoids this issue. It is able to effectively speed up many more components than traditional methods that try to perform parallelization in advance.
Just-in-time parallelization also ensures the accelerated program still returns accurate results. If PaSh, the researchers' system, arrives at a program component that cannot be parallelized (perhaps it is dependent on a component that has not run yet), it simply runs the original version and avoids causing an error.
"No matter the performance benefits - if you promise to make something run in a second instead of a year - if there is any chance of returning incorrect results, no one is going to use your method," Vasilakis says.
Users don't need to make any modifications to use PaSh; they can just add the tool to their existing Unix shell and tell their scripts to use it.
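The fall-back behavior described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not PaSh's actual implementation: the `SAFE_TO_SPLIT` set, the component functions, and `run_component` are all hypothetical names. The decision is made at the moment each component is reached, and anything not known to be safe runs in its original, sequential form.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical: names of components assumed safe to split across chunks.
SAFE_TO_SPLIT = {"to_upper"}


def to_upper(lines):
    # Stateless per-line transformation.
    return [line.upper() for line in lines]


def dedupe_adjacent(lines):
    # Drops repeated neighbouring lines; depends on the previous line,
    # so it is not listed in SAFE_TO_SPLIT.
    out = []
    for line in lines:
        if not out or out[-1] != line:
            out.append(line)
    return out


def run_component(component, lines, workers=2):
    """Decide at execution time whether to split or run the original."""
    if component.__name__ not in SAFE_TO_SPLIT or len(lines) < workers:
        return component(lines), "original"  # fall back, never guess
    size = -(-len(lines) // workers)  # ceiling division
    chunks = [lines[i:i + size] for i in range(0, len(lines), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(component, chunks))  # order is preserved
    return [line for part in parts for line in part], "parallel"


def run_script(components, lines):
    """Run components in order, recording how each one was executed."""
    modes = []
    for component in components:
        lines, mode = run_component(component, lines)
        modes.append(mode)
    return lines, modes


result, modes = run_script([to_upper, dedupe_adjacent], ["a", "a", "b", "b"])
```

In this sketch the first component is split across workers and the second runs unchanged, so the final output matches what a fully sequential run would produce.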
Acceleration and accuracy
PaSh also boosted the speeds of scripts that other approaches were not able to parallelize.
"Our system is the first that shows this type of fully correct transformation, but there is an indirect benefit, too. The way our system is designed allows other researchers and users in industry to build on top of this work," Vasilakis says.
He is excited to get additional feedback from users and see how they enhance the system. The open-source project joined the Linux Foundation last year, making it widely available for users in industry and academia.
Moving forward, Vasilakis wants to use PaSh to tackle the problem of distribution - dividing a program to run on many computers, rather than many processors within one computer. He is also looking to improve the annotation scheme so it is more user-friendly and can better describe complex program components.
This work was supported, in part, by the Defense Advanced Research Projects Agency and the National Science Foundation.
Research Report: "Practically Correct, Just-in-Time Shell Script Parallelization"