in many ways Modern applied research relies on a key algorithm called gradient descent. This is a process that is usually used to find the maximum or minimum value of a particular mathematical function-called the process of optimizing the function. It can be used to calculate anything, from the most profitable way of manufacturing a product to the best way to allocate shifts for workers.
However, despite this wide range of usefulness, researchers have never fully understood the situations in which algorithms struggle the most. Now, the new job explains it, Establish In essence, gradient descent solves a fundamentally difficult computational problem. The new results limit the type of performance researchers can obtain from the technology in specific applications.
“It has a worst-case hardness that is worth knowing,” said Paul Goldberg University of Oxford, co-author of the work and John Fearnley with Rahul Suwanee University of Liverpool and Alexandros Holland Oxford.I received it Best Paper Award In June of each year Computational Theory Seminar.
You can think of a function as a landscape, where the elevation of the land is equal to the value of the function (“profit”) at that particular location. Gradient descent searches for the local minimum of the function by finding the direction of the steepest ascent at a given location and searching downhill from it. The slope of the landscape is called gradient, hence the name gradient descent.
Gradient descent is an important tool for modern applied research, but it does not work well on many common problems. But prior to this research, there is no comprehensive understanding of what gradient descent is and when it gets stuck—another field of computer science called computational complexity theory helps to answer these questions.
“A lot of the work on gradient descent does not involve complexity theory,” said Costis Daskalakis Massachusetts Institute of Technology.
Computational complexity is the study of the resources (usually computational time) required to solve or verify solutions to different computational problems. Researchers divide the questions into different categories, and all questions in the same category share some basic computational characteristics.
For example—an example related to the new paper—imagine a town where there are more people than houses, and everyone lives in a house. Here is a phone book with the names and addresses of everyone in the town. You need to find two people who live in the same house. You know you can find the answer because there are more people than houses, but this may require some search (especially if they don’t share a last name).
This problem belongs to a complexity class called TFNP, which is an abbreviation of “full-function non-deterministic polynomial”. It is a collection of all computing problems, these problems are guaranteed to have solutions, and the correctness of their solutions can be quickly checked. Researchers focus on the intersection of two subsets of problems in TFNP.
The first subset is called PLS (Polynomial Local Search). This is a series of problems that involve finding the minimum or maximum value of a function in a specific area. Ensure that the answers to these questions can be found through relatively straightforward reasoning.
One problem that belongs to the PLS category is the task of planning a route that allows you to visit some fixed number of cities with the shortest travel distance, because you can only change the itinerary during the tour by switching the order of any pair of consecutive cities. It is easy to calculate the length of any suggested route, and because of the limitations in the way you can adjust the itinerary, it is easy to see which changes have shortened the itinerary. You will eventually find a route that cannot be improved by acceptable movement (local minimum).
The second subset of the problem is PPAD (polynomial parity parameter on a directed graph). The solution to these problems comes from a more complex process called the Brouwer fixed point theorem. The theorem says that for any continuous function, one point is guaranteed to remain constant-as we all know, a fixed point. The same is true in daily life. If you stir a glass of water, the theorem guarantees that there must absolutely be a water particle that ends where it started.