Hello everybody, my name is Daniel Egger, and today I will be talking about quantum combinatorial optimization. Throughout this course we will discuss the problem classes that we consider, how we go from quadratic unconstrained binary optimization problems to Ising Hamiltonians, and then we'll start looking at how we can deal with these problems on quantum computers. We'll discuss the adiabatic theorem and adiabatic annealing, then QAOA and extensions of QAOA, and finally some considerations when executing QAOA on quantum hardware. So without further ado, let's start talking about combinatorial optimization, and in particular problem classes.
Now, combinatorial optimization is a topic that is present in many different industries, and I first want to go through one or two examples so that we really understand what we mean by combinatorial optimization. One example that is very present in industry is portfolio optimization; the canonical example is Markowitz portfolio optimization. The goal here is the following: we have an investment universe made of assets, and we want to select the best assets in this universe in order to maximize our expected return and minimize the risk to which we are exposed. The way you can interpret this is that each dot here corresponds to a possible combination of assets, and we want to find the combination that maximizes the return and minimizes the risk. Now, there are many different formulations of portfolio optimization. One that we encounter very often uses continuous weights, as shown here: a certain fraction of our capital is allocated to each of the assets, and then we have a quadratic term which expresses the risk in the portfolio, namely by how much the portfolio fluctuates, and here we have the return part. We minimize this objective under certain constraints, for instance that we want to allocate all of our capital. We can also come up with a binary formulation of this optimization problem, where the decision is simply: do I invest in an asset or do I not? This is the formulation shown below, where the decision variables are now binary, either a zero or a one, and we can also impose budget constraints, for example that we want to invest in a certain number of assets. And then of course there are mixed formulations of these problems, where we might have continuous weights but certain binary constraints, or maybe the other way around. So there is a very large spectrum of possible optimization problems within this simple example. Going forward, another type of combinatorial optimization problem that is often used as an illustration is the maximum cut (MaxCut) problem; you might already have heard about it within this lecture series.
Now, the way MaxCut works is the following: we have a graph, and we want to partition the nodes of this graph into two sets in such a way that the number of edges traversed by the cut is maximal, and you can see this cut drawn here in a pictorial fashion. If I put this into a mathematical formulation, it looks like this: I have my graph with an edge set E and a node set V, where n is the number of nodes, and I want to maximize the number of edges that are traversed by a cut. For each node in the problem I have a decision variable, and this decision variable is either zero or one depending on which side of the cut the node is on. You can see in this diagram at the top that these two purple nodes are on the same side of the cut and the gray nodes are on the other side, which means I have a cut that goes through the edges traversed by the line I have now highlighted. Now, an important thing to know is that for a lot of optimization problems there already exist very efficient and powerful classical solutions; some are heuristic, and others come with guarantees. For instance, there is the celebrated Goemans-Williamson algorithm, which gives an approximation ratio of about 0.878 of the size of the maximum cut. This is already pretty decent, and it is an algorithm that runs in polynomial time and provably gives you cuts of this particular size. So keep this in mind: when we deal with combinatorial optimization, we are always going to have classical algorithms that are extremely performant at dealing with these problems. Now, as already hinted at, there is a whole spectrum of problem types to deal with. There are problems with continuous variables only, and these are typically fairly easy to tackle on classical machines, because we can exploit gradients to efficiently navigate these problems. And then we have problems with integer variables or binary variables, and also problems that mix, let's say, integer variables and continuous ones.
There are many different classical methods to address these, like for instance brute-force search, which by its very name doesn't scale particularly well, but also warm-start methods, for instance relaxations that take us from a discrete space to a continuous one, after which we apply some rounding method. Semi-definite programming, which Goemans-Williamson relies on, is another way to tackle these problems, and then there are very performant classical solvers like CPLEX and Gurobi that can solve some of these problems very efficiently. Now, when we look at quantum methods to deal with these, I think we need to make a bit of a distinction between noisy hardware and fault-tolerant hardware. For methods that run on noisy hardware, which is typically what we are dealing with, we mainly have heuristics, and this is what we will explore in this lecture series; specifically, we will talk a lot more about QAOA and how it works. So keep in mind that when we talk about quantum combinatorial optimization, the nature of the algorithms we run is going to be somewhat influenced by the type of hardware,
and here we are going to be focusing on hardware that is noisy. Now let's take a look at an example of a combinatorial optimization program, namely a quadratic program. This is a program defined by a quadratic form, as shown here: we have the decision variables x coupled by a matrix, and we can also have a linear term. Typically we will be interested in minimizing this formulation subject to various constraints, for instance the ones you can see here. As mentioned, the decision variables x can be continuous, integer, or binary, and going forward we will be interested in the binary case; if you're looking at integer variables, which are also difficult, you can reduce those to a binary formulation as well. One thing to keep in mind is that a quadratic program supports both equality constraints and inequality constraints, as you can see here. You will see in the following parts of this lecture that quantum computers tend to support a way of describing these quadratic formulations as Ising Hamiltonians, and this means that we need to be able to recast these equality and inequality constraints into a quadratic form. The way you can do this is fairly easy: if I take the following equality constraint, I can recast it into a quadratic form by taking the difference between the left-hand side and the right-hand side, squaring it, and penalizing it by this factor M. This is called the big-M formulation, and it allows me to go from an equality constraint to a quadratic unconstrained binary optimization problem. You can see here that this is no longer a constraint but a quadratic penalty term that enters the part of the problem that I am minimizing. Inequality constraints can also be dealt with in this formulation, but they require a little more thought. You can recast an inequality into an equality constraint by introducing a slack variable that is greater than zero, and this once again allows us to recover a quadratic unconstrained binary optimization problem that we will rely upon in the next parts of the lecture.
lecture and then finally you can also convert integer to Binary conversion for instance if I have an integer that is Con constrained to a range let's say 0 to 7 I can very easily replace this single integer with three binary variables as shown here so with that we're nearing the end of the first part where I would like to point point out is that combinatorial optimization deals with problems of this form we're typically interested in minimizing a function where the decision variables are going to be typically binary in the remaining part of this lecture we
In the remaining part of this lecture we will be focusing on quadratic problems, but what I will say also generalizes to higher-order problems, where we have more than quadratic terms. And finally, keep in mind that if you have inequality constraints or equality constraints, there are ways to deal with these as well. So, as you saw in the previous part, we discussed a little bit of combinatorial optimization and what these types of problems might look like. In this next part I will show you how to convert a quadratic unconstrained binary optimization problem into an Ising Hamiltonian, and how the ground state of that Ising Hamiltonian is then the solution to the problem we are seeking to solve. So what exactly is a QUBO problem? A QUBO problem has a quadratic cost function; it has no variable constraints, because we converted all of those into the quadratic part; it is made of binary optimization variables, so variables that take on the value zero or one; and it is a cost function that we want to minimize. If you take a look at the equation here, this is exactly what we are trying to minimize, where Q is an n-by-n matrix and n is the number of decision variables.
So let's again take a look at MaxCut as an example of a QUBO. We have the nodes and we have the edges, and the goal is to partition the nodes into two groups such that the number of edges connecting nodes from different groups is maximized. This very naturally leads to binary optimization variables, zero and one, which indicate on which side of the cut each node is. Now, if you look at the graph, this can also be a weighted graph, where we assign a weight to each edge. If I take for instance the graph shown here, where these numbers indicate the edge weights, then I can cast this graph into a matrix, as shown here: for instance, this number one indicates the edge between nodes zero and one, and similarly this number four corresponds to the edge between node one and node four. So how do we go from this to a MaxCut formulation? We need to write down the objective function for MaxCut, and it can look like this: we want to calculate the value of a cut, which is shown by this term here. For instance, if x_i is zero and x_j is zero, then this whole term sums to zero; however, if x_i is one and x_j is zero, then I have a contribution from that edge given by its weight. The situation I just described corresponds to the case where node i is on one side of the cut and node j is on the other side, and so this cost function describes very nicely the weight that the cut created by this partition traverses. Moving forward, this allows us to express the matrix of the QUBO as shown here, where entry ij of this matrix is simply the negative of the weight of the edge, and the cost function we are interested in is the one you can see here, with both a linear part and a quadratic part.
So how do we solve this with a quantum computer? The whole trick, and the whole goal, is to encode this MaxCut problem into the ground state of a Hamiltonian, meaning that we need to convert this cost function into a Hamiltonian; the state we are looking for is then the one that minimizes the energy. Remember here that the energy is simply the expectation value of the Hamiltonian over the state we are considering, and this Hamiltonian encodes our cost function. The Hamiltonian we want to consider is diagonal, and each entry on the diagonal is the cost function evaluated at that particular basis state; this is what this equation is showing you. If we manage to do this encoding, then finding the ground state of this Hamiltonian gives the state with the lowest energy, and therefore the one that minimizes the cost function; we now have a Hamiltonian formulation of the problem. So we need a mapping from cost function to Hamiltonian, and the way we do this is through the following change of variables: we have binary decision variables x_i, and we introduce spin-like variables z_i that take on the value minus one or plus one, with the change of variables you can see here. Then we promote these new decision variables z_i to spin operators, and when I do this and go through the math on this slide, I arrive at an Ising formulation of the cost function; typically what we will need to do is minimize the expectation values of these types of terms in the Hamiltonian. So, quickly to summarize: in combinatorial optimization we are often dealing with QUBO problems, and as we saw on the previous slide, we can map these QUBO problems to Ising Hamiltonians such that the ground state of the Hamiltonian is the solution to the problem we are looking for. Typically, the way this is done is that we go from binary variables to spin-like variables taking the values minus one and plus one, and whenever these spin-like variables appear in our decision problem we promote them to Pauli Z matrices; this then gives us the Hamiltonian that corresponds to the problem.
Now what we will need to do is find the ground state of that Hamiltonian. So now we will start discussing some of the algorithms, or methods, we have to find the ground states of these Hamiltonians, and in this third part I will be talking about the adiabatic theorem and adiabatic annealing. The idea behind the adiabatic theorem is the following: if I start the system, or the Hamiltonian, in its ground state, and then I perturb the system slowly enough, it will remain in its ground state. This is what you can sort of see here: if I start from this particular potential and then slowly change it and introduce a few extra kinks into it, then the state of the system, corresponding to this dot here, will always stay in a state of minimum energy. This is the guiding principle behind a lot of what I will be showing in the next slides. So let's take a look at adiabatic annealing. To be specific, we are going to consider a Hamiltonian of this form: I have some quadratic terms z_i z_j (there might be weights in front of them or not; for the sake of this example we'll consider an unweighted problem), and then we have a transverse field applied to it. In this particular setting, I can vary the strength of this transverse field and the strength of these quadratic terms.
Now, the Hamiltonian that is made of only the X terms, this one here, is the initial Hamiltonian, and it is the initial Hamiltonian because its ground state is easy to prepare: the ground state is simply the equal superposition of all states that the qubits can take, namely the one shown here. Next, the second part is what I will be referring to as the problem Hamiltonian; it may have additional terms and a bit more complexity, but this is the Hamiltonian whose ground state we are really interested in. Now, the parameters A and B are tunable; these correspond to the annealing schedules that I will be working with. The way this works is that we start off in a setting where the tunable parameter A is strongest at time t = 0 while the other one, B, is simply set to zero. This means that at time zero I am in this Hamiltonian, and I conveniently start in its ground state. Then, in adiabatic annealing, I slowly and progressively ramp down the term in front of the mixing Hamiltonian, as we call it, and ramp up the term in front of the problem Hamiltonian. If I do this slowly enough, then surely enough I will end up in the ground state of the problem Hamiltonian, and then I can simply sample from this state; the samples that come out will correspond to solutions of the combinatorial optimization problem I am interested in. Very good. So, as stated here, the goal is to perform this ramping up and ramping down slowly enough, as per the guidance of the adiabatic theorem.
So now we can ask ourselves: why does this transition need to be slow, and how slow does this ramping, or this schedule, need to be? This has a lot to do with the energy structure of the instantaneous Hamiltonian as we go through the anneal. Remember that in our system there is not only the ground state; all sorts of other states exist, and this is shown here: we have the ground state E0, the first excited state E1, the second excited state E2, and so on and so forth. The thing that drives transitions between states is the energy gap between them. At this critical point, where the ground state is closest to the first excited state, we have what we call the minimum gap, and in some cases this gap can become very small. We know that, due to quantum fluctuations, if we do not make the annealing slow enough, the system will jump from the ground state to the first excited state, and then potentially also to other states; we therefore need to make sure that we go through this gap slowly enough so as not to induce transitions. What makes this complicated is that in complex optimization problems this gap can become very small, which means the annealing schedule needs to be very slow, or rather take a lot of time, meaning a large runtime for the protocol.
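To see this gap structure numerically, here is a small sketch, for a hypothetical 3-qubit problem Hamiltonian with a symmetry-breaking field, that diagonalizes the instantaneous Hamiltonian H(s) = (1 - s) H_mix + s H_prob along the schedule:

```python
import numpy as np
from functools import reduce

X = np.array([[0, 1], [1, 0]], dtype=float)
Z = np.array([[1, 0], [0, -1]], dtype=float)
I2 = np.eye(2)

def op_on(single_qubit_ops, n):
    """Tensor single-qubit operators (dict qubit -> matrix) into n qubits."""
    return reduce(np.kron, [single_qubit_ops.get(k, I2) for k in range(n)])

n = 3
H_mix = -sum(op_on({k: X}, n) for k in range(n))
# Hypothetical problem: a ZZ chain plus a small field to break degeneracy.
H_prob = (op_on({0: Z, 1: Z}, n) + op_on({1: Z, 2: Z}, n)
          + 0.5 * op_on({0: Z}, n))

for s in np.linspace(0.0, 1.0, 6):
    evals = np.linalg.eigvalsh((1 - s) * H_mix + s * H_prob)
    print(f"s = {s:.1f}   gap E1 - E0 = {evals[1] - evals[0]:.4f}")
```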
So let's try to gain a little more insight into this and look at a very simple setting; here I think it is very nice to take a look at the Landau-Zener example. This is a very simple Hamiltonian: it has two states, and the Hamiltonian consists of two terms, an X term and a Z term. In front of the Z term we have a field which varies in time, and, interestingly, the Δ in front of the X term sets the energy gap between the ground state, shown here in blue, and the first excited state, shown here in red. Now, we are going to make sure that at time minus infinity we start off in the ground state, so somewhere over here, and we are going to assume a linear ramp on this field ε(t). You can then go through Landau-Zener theory, and it tells you that at t = +infinity you will be in the excited state with a probability given by the expression below, the exponential of -πΔ²/(2v), where v is the sweep velocity. What we learn from this is that if we want to avoid a transition from the ground state to the first excited state, we need slow schedules, and in particular these schedules need to be slow with respect to this gap. So this is a very nice, contained example; of course, when we are solving a combinatorial optimization problem there are many more levels around, but nevertheless this type of insight provides us with some guidance as to why the schedule should be slow compared to the gap.
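A quick numeric reading of that formula (a sketch with illustrative values, taking ħ = 1 as in the expression above):

```python
import numpy as np

# Landau-Zener excitation probability P = exp(-pi * gap**2 / (2 * v))
# for a linear sweep epsilon(t) = v * t across an avoided crossing.
def p_excited(gap, v):
    return np.exp(-np.pi * gap**2 / (2 * v))

for v in [0.01, 0.1, 1.0, 10.0]:
    print(f"velocity v = {v:>5}: P(excited) = {p_excited(0.2, v):.3e}")
```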
With this, we are at the end of the third part, and what I hope to have communicated to you is that with adiabatic quantum computing we can find the ground state of a Hamiltonian. The way this works is that we start off in the ground state of another Hamiltonian that is easy to prepare, and then we progressively evolve towards the Hamiltonian we are interested in, namely the one that encodes the combinatorial optimization problem we want to solve. There is, however, a caveat: as the gap between the ground state and the first excited state decreases, the annealing schedule needs to be longer, and this slows down the time to solution. So in this part I want to use the insight that we gained on adiabatic annealing to introduce the quantum approximate optimization algorithm (QAOA), which is really one of the workhorse algorithms that we tend to use on a lot of our superconducting-qubit hardware. This is an algorithm that was introduced in 2014 by Edward Farhi, Jeffrey Goldstone, and Sam Gutmann, and it is designed to solve combinatorial optimization problems; here I have written that it can solve QUBO problems, but it can also deal with higher-order problems as well.
For those of you familiar with VQE, QAOA is in fact a variational algorithm and can be considered a special case of VQE, and what you will see is that it is basically a variational form based on Trotterized quantum adiabatic annealing with a classical optimization loop on top of it. This is what I would now like to describe. So let's take a look at time evolution. As we all know, the time evolution of a quantum system is described by the Schrödinger equation, as you can see here, and states evolve under this equation according to |ψ(t)⟩ = U(t)|ψ(0)⟩, where U(t) is a unitary matrix. In the time-independent case, this unitary is simply the exponential of the Hamiltonian times time, with a minus i prefactor. When the Hamiltonian is time-dependent we have to take that into account, and what we then have is a time-ordered exponential; note that there should also be a minus sign in front of the i here, as shown.
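For reference, here is a reconstruction in LaTeX of the evolution formulas the slides refer to (with ħ = 1):

```latex
i\,\frac{d}{dt}\,\lvert\psi(t)\rangle = H(t)\,\lvert\psi(t)\rangle,
\qquad
\lvert\psi(t)\rangle = U(t)\,\lvert\psi(0)\rangle,
\qquad
U(t) =
\begin{cases}
e^{-iHt} & \text{time-independent } H,\\[4pt]
\mathcal{T}\exp\!\left(-i\int_0^t H(t')\,dt'\right) & \text{time-dependent } H(t).
\end{cases}
```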
Nevertheless, what we are going to do here is Trotterize this Hamiltonian, and I want to explain exactly what that means. As you saw in the previous example, we had, in some sense, two Hamiltonians that we added with time-varying prefactors: the first being the mixer Hamiltonian, and the second being the cost-function Hamiltonian. Typically these two Hamiltonians do not commute with each other, so if we want to depict this annealing schedule on a digital quantum computer we need to discretize it. The way this is done is as follows: we take the integral of the Hamiltonian, express it as a sum over small time intervals, as shown here, and then we have in essence multiple time-independent Schrödinger equations. We know the solution to each of those, namely the exponential of the Hamiltonian, and we can simply string them together in order to get the time-evolved state. Keep in mind that this is an approximation of the continuous time evolution, simply because the exponential of the sum of two matrices that do not commute is not the product of the exponentials, as shown here. In effect, the Trotterization introduces an approximation, and the magnitude of this error per step is proportional to the square of the time step. Keep in mind that these matrices A and B are simply related to the mixer Hamiltonian H_M and the cost-function Hamiltonian H_C, and this allows us to express a digital version of annealing, as done here: we alternate between layers of the cost-function Hamiltonian and the mixer.
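Here is a small numerical check of the Trotter approximation just described: for random non-commuting generators, the product formula (e^(A/n) e^(B/n))^n approaches e^(A+B) as the number of steps n grows (a sketch using SciPy's matrix exponential):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(42)

def random_antihermitian(d):
    """Return -i * H for a random Hermitian H, i.e. a generator of a unitary."""
    M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return -1j * (M + M.conj().T) / 2

A, B = random_antihermitian(4), random_antihermitian(4)
exact = expm(A + B)
for n in [1, 2, 4, 8, 16]:
    trotter = np.linalg.matrix_power(expm(A / n) @ expm(B / n), n)
    print(f"n = {n:2d} steps, error = {np.linalg.norm(exact - trotter):.2e}")
```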
So we have a first layer here, then a second layer, and so on and so forth, and this then allows us to arrive at QAOA. What QAOA does is create quantum circuits for these different parts, i.e., for the exponential of the mixer and for the exponential of the cost function, and it goes one step further: it replaces the magnitudes of these time steps with parameters that are left to be optimized. So the circuit that I am now applying is the one shown here, and you can see that the first time step has been replaced by an optimization parameter γ1 that a classical optimization routine will then try to optimize; this is where you can see that the QAOA algorithm resembles the variational quantum eigensolver very much, and is in fact a special case of it. Going forward, you can also see the parameter in front of the mixer, which is likewise left to optimize, and so on and so forth. We say that a QAOA circuit has p layers, corresponding to the number of times that I apply this block of cost function and mixer. Note also this part of the quantum circuit: it is simply a Hadamard gate on all qubits to prepare an equal superposition of states, and keep in mind from the previous parts that we want to start off in the ground state of the mixer, which is exactly what this block of gates is doing. Then, finally, at the end we measure the state of the qubits, which allows us to draw samples, and these samples will correspond to candidate solutions of the problem that we want to solve.
There will be a little bit of back and forth between the quantum and the classical sides, in the sense that we want to find the parameters γ and β that minimize the energy. So, simply put, and just to repeat it, the steps are shown here: we define the mixer Hamiltonian as a sum of Pauli X operators, and the cost-function Hamiltonian following part two that I discussed before, such that this cost-function Hamiltonian includes the classical cost function we want to minimize. We then apply Hadamard gates to prepare the initial state, the ground state of the mixer, as shown by this part, and then we apply p layers, each consisting of two terms, one being the mixer and the other the cost-function Hamiltonian, and at the end we measure. The variational approach comes in because we want to minimize the energy of the cost-function Hamiltonian, and we do this by tuning these parameters γ and β; this is where the back and forth happens between the quantum computer and the classical computer. So you can really see QAOA as, in some sense, a Trotterized adiabatic schedule with an extra optimization loop on top of it. For instance, if you take a fairly deep QAOA, you could initialize it with something that looks like your annealing schedule, and then what the classical optimization does is fine-tune these parameters γ and β a little in order to optimize, let's say, the energy of the Hamiltonian.
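As a concrete sketch, Qiskit ships a QAOAAnsatz circuit; assuming a recent Qiskit install and a hypothetical 3-node MaxCut cost operator, a p = 2 circuit can be built like this:

```python
from qiskit.circuit.library import QAOAAnsatz
from qiskit.quantum_info import SparsePauliOp

# Hypothetical cost Hamiltonian for MaxCut on a 3-node triangle.
cost_op = SparsePauliOp.from_list([("ZZI", 1.0), ("IZZ", 1.0), ("ZIZ", 1.0)])

# p = 2 layers; the default initial state is Hadamards on all qubits and
# the default mixer is a layer of X rotations.
ansatz = QAOAAnsatz(cost_operator=cost_op, reps=2)
print(ansatz.num_parameters)  # 4 free parameters: two betas and two gammas
print(ansatz.decompose().draw(fold=-1))
```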
What is very nice about this is that the circuit depth depends on p, the number of layers. So if the hardware is fairly noisy, you can decide to operate at a lower p and still try to produce good samples, maybe by investing more effort into the classical optimization of the parameters γ and β. And of course, as you make p larger, making the circuits longer and deeper, the whole algorithm tends to resemble annealing more and more, and what is very nice is that in the limit of infinite p, and, let's say, noiseless hardware, you are guaranteed to converge to the ground state of your optimization problem. But of course, we want to do things with a smaller number of layers than an infinite one. So, a quick summary of QAOA: it is a discretized version of adiabatic annealing, which allows us to implement it on digital quantum computers. It is also a variational algorithm; it takes a lot of inspiration from the variational approach to optimize these parameters, and this allows us to reduce, let's say, the runtime of the algorithm in terms of the length of the circuit, strictly speaking, compared to annealing. For these reasons it has the ability to outperform adiabatic annealing in terms of runtime when you look at the length of the circuit, and furthermore it has the possibility to include diabatic transitions, which may help overcome, let's say, some of these transitions to higher excited states. Now, one of the limitations of QAOA is that it does not come with any performance guarantees, outside of the fact that in the limit of large p you will converge to the ground state.
In some very narrow conditions you can derive performance guarantees; for instance, on 3-regular graphs you can compute the approximation ratio you would get with depth-one and depth-two QAOA, but in general, if you want to outperform classical solvers, a high depth is going to be needed. Furthermore, and this is a limitation with respect to the hardware, you need to implement this complex cost function, and to do that you typically have to overcome the limited connectivity of superconducting-qubit devices; this is something I'll discuss a little more in one of the following parts. Finally, this variational approach typically requires a large number of shots and iterations as the problem size increases, and this is also why a lot of research nowadays focuses on finding heuristics and methods to derive good parameters γ and β, so that you don't need to do this classical optimization between the quantum and classical computers in a closed-loop fashion; the idea would simply be that you take a circuit with good values of the γs and βs and then sample from that circuit to arrive at good solutions of the problem.
So in this part I want to talk about some extensions of the QAOA algorithm that we saw in the previous part, and namely I want to present two research papers, dating back to 2020 and 2021: the first one is recursive QAOA and the second one is warm-start QAOA. What I hope to do here is show you some twists and developments that we can play on top of standard QAOA in order to improve the algorithm, or maybe get better performance guarantees out of it. So, without further ado, I will start with recursive QAOA, which was a paper by Bravyi and collaborators published in Physical Review Letters. The idea is that we want to apply QAOA multiple times, and at each step we reduce the number of variables in our problem by one. We start off with an n-variable problem, as shown here, then we apply QAOA to it, and we get a certain state.
Then we perform a variable reduction, and the way this works is the following: I sample from this QAOA state and evaluate the correlations ⟨Z_i Z_j⟩ between the different decision variables, as shown here. Then I take the two decision variables z_i and z_j that have the largest correlation in magnitude, which could be a positive correlation or a negative one, and use it to perform a replacement of the variables: namely, I substitute variable z_i with variable z_j times the sign of this largest-in-magnitude correlation. Doing this allows me to go from a problem with n decision variables to a problem with n minus one decision variables, and then we simply repeat: if the problem is not small enough, we apply this combination of QAOA and variable reduction once again. This is why it is called recursive QAOA, simply because I iterate this variable reduction and QAOA multiple times until I am satisfied with the size of the problem; once the problem has become small enough, I should be able to deal with it using other methods. The output of RQAOA is thus a list of substitutions, and these are the substitutions that are exemplified here: for instance, variable one might be negatively correlated with variable five, so if variable one were a zero, variable five would be a one; or variables two and three could be positively correlated, meaning that if z2 were one, z3 would also be one, for instance.
So let's take a look at an example of this. I am going to go back to the MaxCut problem that I introduced before, simply because it is a very nice example with which to illustrate and teach optimization. We have this graph with five nodes, labeled zero to four, and I could for instance create an optimized state at p = 1; this might be how the counts look after a QAOA circuit that I have optimized, and based on this I can compute the correlation matrix between the different decision variables. You will see that this is what comes out, and if you look closely you will see that this correlation matrix already reflects some of the structure of the problem: I have a negative correlation between decision variables that are connected by an edge. For instance, nodes zero and one are shown by this particular entry, and zero and two would be this one; so we can already see a little of the problem structure in this correlation matrix. Now I pick the largest entry (I can break ties in whatever way I see fit), and I substitute decision variable zero with decision variable one; since they are negatively correlated, I assume that z0 is in fact minus z1, as shown here. This then defines a new problem, and I iterate through this multiple times. So why is RQAOA so interesting? Well, if you go and read the paper by Sergey Bravyi, you will find the following interesting information in it.
There is a certain class of problems, with the Hamiltonian shown here, valid for integers divisible by six, and these Hamiltonians satisfy the following properties: there is a local classical algorithm that gives you an approximation ratio of one for the MaxCut on these Hamiltonians. Very well. Now, if you apply standard level-p QAOA to these Hamiltonians, you can show that you arrive at an approximation ratio of at most p/(p + 1), and note that this is smaller than one. However, if you apply level-one recursive QAOA, you can prove that you will get an approximation ratio of one out of this scheme. So what this is telling you is that there is a class of problems where recursive QAOA will outperform standard QAOA, and this is essentially the justification behind recursive QAOA. With that, I would like to move on to another enhancement of QAOA that we consider, namely warm-starting QAOA, which is described in this paper here. The idea behind this paper is that we want to leverage relaxations, a trick that is often employed in classical computing as well to solve these problems.
Namely, I take a combinatorial optimization problem, as shown here, and I relax the binary variables into continuous ones. When I do that, I arrive at a smooth optimization landscape that can be efficiently solved on a classical computer, and this holds if the matrix Q is positive semi-definite. Now let's see how this works in practice. I have taken my binary variables, relaxed them into continuous ones, and solved the corresponding problem; this gives solutions to the relaxation, where these solutions are now continuous within the range zero to one. What I then do is prepare a state on the quantum computer which corresponds to this continuous relaxation, and this is shown here: you can see that the initial state is no longer prepared by a set of Hadamard gates but by RY rotations parameterized by the results of this relaxation. Now, you might remember that if I change the initial state, I also need to change the mixer, and this is what we do in this work: we derived a form of the mixer such that the initial state is also the ground state of this new mixer. You can see that this new mixer is parameterized by the same β as before, but in addition we also have the θ parameters shown here, and these θ parameters are obtained from the classical relaxation of the binary problem to the continuous one. So the circuit that we see here is basically the essence, if you will, of warm-start QAOA.
There are a few things to keep in mind. If the results that we get from the relaxation lie in the interval zero to one, excluding the endpoints zero and one, then the initial state is guaranteed to have some overlap with the optimal solution, and through the adiabatic theorem this means that if I apply an infinite number of layers, I will converge to the ground state of the cost-function Hamiltonian; that is very nice. There is one caveat, though: if c*_i, the result of the relaxation, is exactly zero or one, then I might end up in a situation where there is no overlap between the initial state and the ground state of the cost-function Hamiltonian. The way we remedy this is by introducing this ε parameter, a regularization which basically projects the edges of this rectangle into the interior of the continuous space. For instance, if the continuous relaxation of the variables x1 and x0 lies within this rectangle, nothing is changed, but if it were to lie here on the edge, I would instead project it over here, so that we have some overlap with the ground state of the cost-function Hamiltonian. So let's take a look at warm-start QAOA in practice. For this we will consider the Markowitz portfolio optimization that I introduced in the first part: we seek to maximize our return and minimize our risk under certain budget constraints, as shown here. We'll look at a fairly simple problem, one with six assets, and we will compare standard QAOA to warm-start QAOA.
You can see here in orange the probability of sampling the optimal solution with standard QAOA, and if I use warm-start QAOA this probability jumps quite drastically; this is because the continuous relaxation of the problem is able to bias me towards good states. What is also interesting is that if you look at the energy, you will see that it is smaller at shallower depths, which makes the algorithm easier to implement on hardware, because in practice I need fewer layers p. Then you can also do a sanity check, which is shown here by these blue curves: here we started the system in the continuous relaxation but did not adjust the mixer, which is what we mean by "normal mixer," and you can see that if you don't adjust the mixer, the results are really not that good. The energy does not improve towards lower values as it does for the proper warm start and for standard QAOA, and this really shows the interplay between the initial solution and the requirement that it be the ground state of the mixer you are using. So, in summary, in this part I introduced recursive QAOA, where we apply QAOA multiple times to reduce the size of the problem, and which has better performance on certain classes of Hamiltonians; and then in the second part we discussed warm-start QAOA, where the intent is to leverage relaxations and randomized roundings, and what is nice is that this can inherit some of the performance guarantees of the classical approaches. And with this, I am at the end of this part.
So in this part I want to discuss some aspects of executing QAOA on superconducting qubits. We won't go into too many details, but I want to point out some of the key steps that one needs to pay attention to when running these experiments. You may have heard about this already, but one model of execution that we like is the Qiskit pattern, and Qiskit patterns have essentially four steps. The first one is problem modeling: taking, let's say, your classical combinatorial optimization problem and modeling it in a fashion that is amenable to a quantum computer, which basically means creating quantum circuits; as you've been paying attention, you know very well that this corresponds to creating quantum circuits like those of QAOA, warm-start QAOA, and so on. Then, once we have the circuits, we also need to optimize them, which is a very key point that I will dwell upon a bit more in the following slides. Once we have our optimized quantum circuits, we need to execute them on the hardware, which, for all intents and purposes, typically means sampling from these quantum circuits. And then, finally, once you have those samples, you want to post-process them in order to arrive at solutions to your problem; this could correspond to the variable-reduction step in recursive QAOA, or simply to taking the sampled bit strings and interpreting them as solutions to the combinatorial optimization problem.
We already know about problem modeling, but here I want to emphasize one point that we haven't touched upon much until now: when you look at a combinatorial optimization problem, there are often many different ways to model it. For instance, you could choose to model it as a QUBO, or you could model it in some other fashion; maybe the problem is natively a higher-order problem, like the LABS problem, which has fourth-order terms, and you could decide either to keep that fourth-order structure or to introduce extra variables in order to reduce it to a QUBO formulation. So keep in mind that problem modeling is often done in such a way as to make the problem at least somewhat amenable to solvers, be they classical or quantum. Now, let's say that we have modeled our problem as a QUBO and derived quantum circuits for it; we now have an idea of what these QAOA circuits might look like: we start in an equal superposition of all states, and then we apply the cost-function Hamiltonian and the mixer, and so on. But there are a lot of things that then need to happen to have a good execution on the hardware.
The first thing we need to do is select good qubits: qubits that are not too noisy, that are not plagued by, let's say, two-level systems or other noise, and that have good coherence times; so we want to select a good physical set of qubits. Furthermore, we have one extra degree of freedom: decision variable zero does not need to correspond to qubit zero; in fact, we can permute this mapping of decision variables to qubits, and this is what I refer to as the initial mapping. There are many different ways to do this. You have, for instance, the SABRE algorithm, which has its own initial mapping; this is a heuristic one, and it works in any situation. But you can also have problem-tailored initial mappings, like the SAT mapping described in a paper by Matsuo and collaborators. By leveraging this initial mapping we can often make the circuit somewhat easier to implement on the hardware.
Another thing to keep in mind when dealing with superconducting-qubit hardware is that the connectivity between the qubits is limited. You might remember an older backend that we had, IBM Guadalupe; this was a 16-qubit machine, and you can see here the coupling map of this device. Each node is a qubit, and each edge corresponds to a two-qubit gate that I can natively apply on the hardware. In practice, many optimization problems that we are interested in are dense, as shown here on the left: I have a 16-decision-variable problem which requires couplings between all of the qubits, so there is a very big difference in connectivity between the problem I want to implement and what the hardware can actually do. This is where routing comes into play: we need to apply swap gates in order to bring distant qubits close to each other so that we can actually implement the cost-function Hamiltonian, and you need to do this in a fairly intelligent way in order to reduce the overall depth of the circuit. Now, what is very nice in combinatorial optimization is that the Hamiltonian is diagonal; it is made of Pauli Z terms, which all commute, so we have a little freedom in the order in which we apply the terms, and we can choose this order so as to make the circuit somewhat less deep, for instance by applying terms between qubits 10 and 12 in parallel with those between qubits 2 and 3. This leads to somewhat more compact and shallower quantum circuits.
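As a small sketch of the routing step, here is a hypothetical all-to-all cost layer on five qubits transpiled onto a small heavy-hex coupling map with SABRE routing (assuming a recent Qiskit install; the exact depths printed will vary):

```python
from qiskit import QuantumCircuit, transpile
from qiskit.transpiler import CouplingMap

# Hypothetical dense problem: one RZZ interaction per pair of 5 qubits.
qc = QuantumCircuit(5)
for i in range(5):
    for j in range(i + 1, 5):
        qc.rzz(0.1, i, j)  # one cost-layer term per problem edge
qc.measure_all()

cmap = CouplingMap.from_heavy_hex(3)  # a small heavy-hex lattice
routed = transpile(qc, coupling_map=cmap, routing_method="sabre",
                   optimization_level=3, seed_transpiler=1)
print("depth before routing:", qc.depth(), "| after:", routed.depth())
```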
Another aspect to consider when optimizing the circuits is to employ noise-suppression methods, for instance dynamical decoupling. This is a well-known method; it definitely predates the reference that I have given here, but this reference is a nice modern summary. In some sense, dynamical decoupling finds the idle times in the quantum circuit and inserts sequences of single-qubit gates in order to decouple the system qubits from the environment to which they are exposed, and this allows us to suppress some of the noise in the quantum circuit. Depending on the hardware, if we have access to either fractional gates or the pulse level, we may be able to do further optimizations, by which we scale, for instance, the cross-resonance gates in order to arrive at even more compact schedules to execute on the hardware. So keep in mind that circuit optimization can involve a lot of steps, like initial mapping, swap routing, noise suppression, and so on. Then, for the hardware execution, we typically want to choose between a sampler and an estimator primitive.
Typically, in optimization tasks we are interested in samples: we want to sample candidate solutions from the quantum circuits, simply because we want to find the configuration of the decision variables that minimizes the cost function. But that is not always the case: if you look at the recursive QAOA algorithm that I mentioned in one of the previous parts, there we actually only need to evaluate the correlators, and this is a situation where we are dealing with expectation values; so for recursive QAOA we would be interested in using the estimator primitive. Another thing that might guide our choice between the sampler and the estimator is what kind of aggregation we want to use. Typically, an expectation value is an average over the samples, but there are different ways of aggregating the samples that we could leverage. There is, for instance, the conditional value at risk (CVaR), which tells us to consider only the best α fraction of the samples: for each sample that I take, I can run it through my cost function, order the samples by how well they did with respect to this cost function, and then consider only the best α fraction of them. In some of our recent work we have also shown that CVaR has some nice properties in terms of helping the optimization and giving noise-free expectation values.
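A minimal sketch of CVaR aggregation, with hypothetical per-sample costs:

```python
import numpy as np

def cvar(costs, alpha=0.25):
    """Average of the best (lowest-cost) alpha fraction of the samples."""
    costs = np.sort(np.asarray(costs))            # lowest cost first
    k = max(1, int(np.ceil(alpha * len(costs))))  # number of samples kept
    return costs[:k].mean()

costs = [3.0, -1.0, 0.5, -2.0, 4.0, -1.5, 2.0, 0.0]
print("mean:", np.mean(costs), "| CVaR(0.25):", cvar(costs, alpha=0.25))
```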
Then, finally, you need to post-process the samples that you have drawn from the hardware. For instance, in standard QAOA we simply interpret the measured samples as candidate solutions to the optimization problem; in recursive QAOA we need to do a little more legwork, and we actually aggregate these samples into expectation values and correlators so that we can decide which variables to substitute. So this was, in a nutshell, some of what you need to consider when executing on hardware. What I want to do with this slide is give a bit of an example of some recent work that has been done on superconducting-qubit hardware investigating optimization tasks, and here I have picked out candidate examples which typically run on a fairly large number of qubits.
For instance, in this work by Samantha Barron and her collaborators, we were looking at ways that CVaR can help arrive at, let's say, noise-free expectation values. Then, as quantum computers scale, another question is how to benchmark them, and what is discussed here in the work of Miessen and collaborators is a way to look at universal scaling laws, like the Kibble-Zurek mechanism, and use deviations from that universal scaling in order to benchmark quantum computers. Another nice example of large-scale optimization on hardware looks at higher-order Ising problems, more specifically third-order terms, and this was done with a fairly large number of qubits, with up to depth-two QAOA. I find this particularly interesting, actually, because when I was doing my PhD, the state of the art was maybe one or two two-qubit gates on a quantum computer with a few qubits, typically in a university lab, and I find it truly amazing that we can now start investigating these algorithms on noisy hardware, but with a fairly large number of qubits and gates.
Now, in the three previous papers that I mentioned, the topology that was used was the hardware-native heavy-hexagonal topology. This is nice for exploring things in certain conditions, but ultimately I think the end goal of quantum combinatorial optimization is to represent arbitrary connectivity on the hardware, and this was done in this paper by Sack and collaborators, where we looked at random 3-regular graphs with depth-two QAOA on up to 40 qubits. You can see that the quantum circuits this generates are starting to become fairly deep and also fairly wide, and it is already interesting to see that you can still pick out some signal for problems this large. Then, another recent paper that appeared this month on the arXiv investigated QAOA protocols on up to 105 qubits, and what is interesting here is that the authors really push into the domain of very large p in QAOA and look at how the approximation ratio scales with the depth of the circuit, keeping in mind that deeper circuits will have more noise. Crucially, I think the main point is that utility-scale hardware will enable research into these algorithms and help us make progress in this particular domain. So with that, I am arriving at the end of this course; I hope you found it interesting. To quickly summarize, in a nutshell, what we learned: we talked about combinatorial optimization and how to formulate these problems in terms of the ground-state problem of a Hamiltonian, and then we looked at adiabatic quantum computing and how to derive QAOA from it.
We also took a look at some extensions of QAOA and certain considerations that one has to keep in mind when running on hardware. And with this I would like to end with some concluding remarks. Many optimization problems are NP-hard. We have very good classical methods to deal with them, but these classical methods are heuristic; they don't necessarily come with performance guarantees, and you can probably always find some worst-case instances where these classical approaches fail. I think the situation is similar with quantum approaches to combinatorial optimization: QAOA is a heuristic algorithm, and this means we must explore it and its variants in practice, on instances that the community deems to be valuable. Here we really want to find, let's say, problems where classical heuristics might struggle, and see if we can improve upon them with the quantum heuristics that we have. And with that, I am at the end of this lecture. I really hope that you enjoyed it. Thank you very much.