Research

Figure 1: The infamous Wow! Signal, discovered by radio astronomers in 1977. The signal is represented as a continuous time series (in coral), and as a discrete time series (in blue).

A time series carries important information in the form of transition probabilities. For example, imagine we measure the ambient temperature on a daily basis. One day we measure a temperature of 17° C, and we want to know the probability that the temperature will be 19° C three days later. We refer to this quantity as the transition probability from 17 to 19 in 3 steps. More abstractly, given a natural number k and two data points i and j, we may ask what is the probability of transitioning from i to j in k steps (e.g., in k units of time). This quantity will be denoted Tij(k). In our temperature example, i=17, j=19, and k=3.

Figure 2: Transition probabilities of discretized time series using AAPL stock's return at various kᵗʰ time steps. Each image represents the elements of transition matrix with yellow and purple points transition probabilities of 0 and 1, respectively.

Multidimensional time series, where multiple values are measured in time, may exhibit some additional interesting properties. For example, a two-dimensional time series could consist of two different stock prices as a function of time. In addition to the transition probabilities (exhibited for each time series individually), the two time series of stock prices may exhibit certain positive or negative correlations. Being able to identify and quantify connections between the values of different stocks can provide critical insights into their joint behaviour.

The last ingredient to motivate our work here is the concept of synthetic time series generation. For example, if a data scientist is unable to access sufficient data or sensitive time series data cannot be shared with them due to privacy issues, the data scientist might need synthetic time series in order to perform adequate analysis. Synthetic time series data should preserve properties from the original time series - including transition probabilities and multidimensional correlations - but should be different enough that they present as new time series.

Generative Adversarial Networks (GANs) and their quantum analog (qGANs) have previously fallen short of generating synthetic time series. Broadly speaking, modeling can range from generalized machine learning models which are broadly descriptive to deterministic learning models where computations can be carried out exactly. Deterministic models form the basis of physics and engineering, where the goal is to mathematically model various physical realities. For example, many exotic phenomena like black hole mergers (Zilhão et al. 2012) and RNA folding (Pincus et al. 2008) are modeled using various mathematical formalisms like algebraic tensor theories and stochastic equations. Development of these models often involves months, if not years, of effort and expertise in the relevant domain. And, more often than not, they fail to capture the intricate complexities of reality.

On the other hand, the explosive progress in computational resources has made it possible to train a computer to develop more and more sophisticated and expansive models. The last decade of machine learning research has shown the sheer success of models learned computationally from empirical data rather than developed analytically from underlying physical and mathematical principles. For example, at the time of writing this blog, no simple analytical model could pinpoint the exact position of a dog in a given image, while a neural network with reasonable computational power would take just minutes to do the same.

A considerable amount of work has been dedicated to bringing together learned and analytical models. This can be accomplished by either (a) bringing machine learning insights into the analytical model, or by (b) adding analytical constraints to the learned model. Both strategies have been successfully applied in various places. This is where quantum Hamiltonian learning methods kick in. More specifically, let's try to understand analytical and learning models in the case of time series data.

Given a time series, there exist a plethora of analytical modeling tools, from AutoRegressive Integrated Moving Average (ARIMA) models to Markov techniques (Bishop et al. 2007). On the learning side, one can pick a complete Deep Neural Network (DNN) to infer fundamental characteristics of a time series. A classic example of adding more analytical constraints into a learned time series model is the seminal work by Rumelhart et al. (1986), where the concept of Recurrent Neural Network (RNN) was first proposed. We introduce analytical constraints on a learning model based on quantum mechanics. Broadly, we want to constrain our learning space to a set of quantum Hamiltonians, or energy operators, that would closest replicate the characteristics of a given time series.

We start by translating the problem into quantum-friendly language. With adequate preprocessing, we can change each data point in the time series into basis state |i> for some i in {0,...,2n-1} and an adequate n, or number of states into which our time series will be reduced. Now that we have our quantum time series, how can we obtain our transition probabilities? As the exact formalism is a bit technical, we may sacrifice some rigor for the sake of seeing the bigger picture (if you're not familiar with the terminology that follow, you may comfortably allow yourself to skip to the next paragraph).

Figure 3: (left) The Wow! signal is reduced from a continuous time series (violet) into a discrete time series (black) using the Symbolic Aggregate approXimation (SAX) algorithm. (right) Transition matrix showing the probabilities of each quantum state transitioning to the corresponding quantum state on the other axis.


The space where our data points - our quantum basis states - live is our system. For each particular unit of time, k, the k-step transition probabilities can be found via a quantum channel. Thanks to Stinespring's dilation theorem (Paulsen et al. 2003), each quantum channel can be obtained from a unitary operator acting on a larger space, consisting of both our original system and an additional "environment". By imposing further uniformity and coherence requirements on the underlying unitaries and environments, we find ourselves in a setting where a unitary acting on the system combined with the environment induces the transition probabilities of each individual time series in a multidimensional time series in a nice and uniform way. We refer to such series as K-Coherent, where K is the set of time steps in question.

So how do we achieve this goal? We construct a quantum circuit where the transition probabilities of the input time series are used in our training process to optimize the parameters of the circuit (Horowitz 2022). This eventually results in a unitary operator inducing transition probabilities that are close to the original values

Figure 4: Covalent directed acyclic graph showing the flow of computations in generating synthetic time series.

A key component of our learning process is the Symbolic Aggregate approXimation (SAX) algorithm. The SAX algorithm is commonly used in machine learning in order to reduce the size of time series data. An example of the SAX algorithm as applied to the Wow! Signal is shown in Figure 3. An input continuous time series (the violet curve in Figure 3) is reduced to a collection of discrete points (black diamonds in Figure 3), separable into quantum states (y-axis of Figure 3). In our quantum time series analysis, a continuous time series is reduced to two quantum states using the SAX algorithm.

We assemble our algorithm using Covalent, our own open-source Pythonic workflow management tool. Covalent enables users to break their experiments down into manageable tasks that come together into a final workflow. The Covalent user interface allows users to visualize how each task fits into the workflow, as well as track the input and output of each task. Covalent tasks can be executed locally, on the cloud, a supercomputer, or a quantum computer. This makes it easy to construct quantum-hybrid experiments, where only specific tasks are run on quantum hardware. Using Covalent, our learning process can be broken down into discrete steps, where each step is created by defining a Python function with a Covalent electron decorator.

We used the following 9 functions in our learning process:

  • Generate Time Series: A function to generate synthetic time series by solving a known stochastic differential equation
  • Calculate δxi: Calculating the first order difference series for having a stationary series
  • Discretize 2N: Using the Symbolic Aggregate approXimation (SAX) algorithm, we discretize the stationary series
  • Find the Transition Probabilities Tij(k): Calculating the transition matrix Tij, the probability for i state to go j state in k time steps
  • ⚛︎ Train Q Network: Iterative optimization requiring a quantum computer call to calculate the loss function and a classical computer call to run a minimization algorithm
  • ⚛︎ Generate New δxi Values: Using the trained network to generate new first order sequence by having quantum computer calls for each time step and repeating it multiple times
  • Calculate Mean and Variance: Calculate the mean an standard deviation of first order difference at various time steps
  • Calculate ∫δx_i → xi: Integrate the first order difference to get back the original time series
  • Plot and Compare: Plot the obtained time series along with the main time series

The steps with a ⚛︎ symbol are run on a quantum computer.

The Results

Figure 5: The original input 2-dimensional time series (plotted in red), and quantum-generated time series (plotted in blue). The solid blue line indicates the mean value of the quantum-generated time series, while the shaded blue region indicates the variance.

We ran our experiments on IonQ's 11-qubit trapped-ion quantum computer. Figure 5 shows both the input 2-dimensional time series data in red and the average quantum-generated time series plotted in blue. You can see that certain features, such as the negative slope, are preserved in the quantum-generated time series. In Figure 6, you can see that the correlation in the input time series data (plotted in red) and the correlation in the mean quantum-generated time series data (plotted in blue) are very similar.

The skeptical reader may rightly ask if we could have done the same thing using a classical computer. While we don't have proof that the answer is no, we do have some strong evidence suggesting that quantum computers might be particularly well-suited for such problems. We were able to show that finding a quantum process that generates the desired transition probabilities amounts to solving a rather complicated set of equations. In the most general case, this is known to be a computationally difficult task. Furthermore, we were able to experimentally observe that the process we learn exhibits a certain probabilistic property known as quantum non-Markovianity, which makes it very difficult to simulate using only a classical computer.

Figure 6: Correlation between the 2-dimensional time series of the given data (left) and mean of quantum generated series (right). Note the negative correlation feature being captured by the generated model.

The Future

Our work opens the door to new intriguing directions of research. Can we find stronger evidence for quantum advantage? What are the limits of our algorithm? Does our algorithm work for all possible multidimensional time series? These questions remain wide open, and we invite the intrigued reader to help solve them.

References

[1] Christopher M. Bishop and Nasser M. Nasrabadi. Pattern Recognition and Machine Learning. Journal of Electronic Imaging, 16(4):049901, January 2007.

[3] Vern Paulsen. Completely Bounded Maps and Operator Algebras. Cambridge Studies in Advanced Mathematics. Cambridge University Press, 2003.

[4] D. L. Pincus, S. S. Cho, C. Hyeon, and D. Thirumalai. Minimal models for proteins and RNA: From folding to function. arXiv e-prints, page arXiv:0808.3099, August 2008.

[5] David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams. Learning representations by back-propagating errors. , 323(6088):533–536, October 1986.

[6] Miguel Zilhão, Vitor Cardoso, Carlos Herdeiro, Luis Lehner, and Ulrich Sperhake. Collisions of charged black holes. ,85(12):124062, June 2012.

In February of 2022, Agnostiq released a study providing the present state-of-the-art for solving combinatorial optimization problems on real, noisy gate-model quantum computers. Although the study was primarily focused on the optimization of discrete financial portfolios, the work is much more general with consequences for other industrially important problems including vehicle routing, task scheduling and facility location services.

While this post encompasses broadly the the performance of different gate-model quantum processors for performing portfolio optimization tasks, for a greater level of detail, we recommend that the reader consult the full paper now available on arXiV.


A combinatorial optimization problem arises whenever we are presented with a number of choices and we task ourselves with selecting the best choice. One example surfaces from a problem often posed by delivery companies: “given a set of possible vehicle routes, which one permits the driver to deliver all parcels the fastest?” Another is asked in telecommunications: “given a set number of approved locations for broadcast antennas, which sites should be used to maximize the reach of mobile phone signal?” Lastly, and directly related to this post, a financial investor asks: “Given a list of stocks and a budget for how many different stocks can be bought, which assignment maximizes the amount of cash accumulated over time with the minimal amount of risk?”.

Although hotly debated, many believe that such problems will be best solved using near-future quantum processors; a much sought-after achievement of quantum supremacy for useful problems. All of the above companies now ask: “given present-generation noisy quantum computers, how far are we along the path to possible quantum supremacy?” 

While it is quite difficult to say if/when supremacy will be achieved, when judged by the quantum volume benchmark, it is clear that the general quality of qubits is improving. Given this improvement in general metrics, has application-specific performance kept up? While one may naively suggest that improvements in the former implies improvement for the latter, a recent paper in Nature has shown that general performance metrics like quantum volume overestimate the performance actually achieved in structured programs like quantum-optimization algorithms. In this post, we show explicitly that performance in discrete combinatorial optimization has improved and further expose the non-alignment of general and application specific performance. So, let’s benchmark some quantum computers; what are our options?

While there are a plethora of different ways to make a quantum computer, at the time of writing, there are two leading paradigms. One is using trapped ion qubits. Here, the two-level system of an individual qubit is realized through naturally occurring electronic states of an ion trapped/arranged in an electromagnetic field. Qubits are manipulated (i.e, their gates are implemented by) lasers incident on the ions. The other leading method is to use superconducting transmon qubits. Here, a kind of ‘artificial atom’ is created through the careful pairing of capacitors and Josephson junctions linked by superconducting wires. In this case, qubits can be manipulated using microwave resonators.

While each paradigm has its pros and cons, the only way to know which systems perform well for given applications is to test them. We did just that by performing portfolio optimization tasks on  superconducting transmon QPUs from IBM and Rigetti and a trapped ion QPU from IonQ. 

How did we judge the portfolio optimization performance for each QPU? Given a set of well known stocks in the tech space (GOOG, AMZN, FB, NVDA, TSLA etc.), we tasked each QPU with making the best choice of stocks to invest in given (i) historical market data and (ii) a fixed number of tickers. To understand how the performance of each QPU is judged, we need to first understand a feature of the underlying algorithm performing the optimization: The Quantum Approximate Optimization Algorithm (QAOA).

Broadly speaking, the QAOA samples answers to combinatorial optimization problems from a distribution biased towards sampling good solutions. Our metric - the rather formally named Normalized and Complementary Wasserstein Distance (NCWD) - measures how well the QPU was able to bias the distribution. Importantly, increasing one parameter - the number of layers p in the QAOA circuit - is known to improve the bias imparted by the QAOA. At odds with this is the requirement of a larger number of one and two qubit gates required to increase p. That is, since present QPUs are noise-prone, there comes a point of diminishing returns where any would-be increases in performance coming with larger p are stifled by the tide of noise which acts to degrade performance. Now, a natural question to ask is: at what point do these effects balance for the QPUs we benchmarked?

For regular QAOA (R-QAOA; the original formulation of the algorithm), this point occurs with a large peak in performance at p = 4 for 3 stocks on a IonQ’s 11 qubit trapped ion chip [Figure 1(a)]. For 2 stocks, increasing performance with p was observed up to p = 5 for superconducting chips [shown for Rigetti in Figure 1(b)]. Although no exactly comparable benchmarks have previously been performed, this amounts to a sizable improvement over the most related benchmarks in recent years. Indeed, just two short years ago, the original observation of increasing performance with p was landmark; using two qubits (equivalent to two stocks here) performance increased from p = 1 to 2 when studying a different optimization problem.

Clearly such an observation is now common throughout most hardware at larger depths and qubit counts. Even more recently, benchmarks on Google Sycamore revealed that problems native to the qubit topology (the map of which qubits can directly interact with each other; see the insets of Figure 1) showed increasing performance up to p = 3 using more than 10 qubits. However, one always has to read the small print. These problems were on average efficiently solvable using classical algorithms in stark contrast to discrete portfolio optimization which is not. The work on Google Sycamore did also study harder problems (see the supplement of the paper for the important details) but did not observe peak performance at p = 4 and 3 qubits as we have here.

Figure 1 - Performance of the portfolio optimization algorithm (as measured by the Normalized and Complementary Wasserstein distance; NCWD) versus the number of QAOA layers p. Qubit topologies/connectivities for each QPU are shown as insets. A performance score of 0.5 means the algorithm was indistinguishable from random guessing while a score of 1 means the best possible portfolio was sampled with 100% probability. (a) R-QAOA on the 11-qubit IonQ machine. A performance peak is seen at p = 4 for 3 stocks in the wider pool. (b) Two variations of E-QAOA using three stocks and R-QAOA with 2 stocks in the wider pool on Rigetti Aspen-10. By p = 5, E-QAOA-II requires in excess of 1000 one and two qubit gates before transpilation and still produces good portfolios better than a random guess.

We also deployed two variations of more advanced versions of the QAOA (the Quantum Alternating Operator Ansätz) to the QPUs. These benchmarks have not been performed on quantum hardware for any combinatorial optimization problem making these measurements first-of-a-kind. This extension to R-QAOA takes an alternative approach to ensuring the algorithm keeps true to the budget for the number of stocks that can be purchased. In this case, portfolios which exceed or fall short of the budget are strictly forbidden (on an ideal QPU at least, noise does break this rule slightly) in contrast to the R-QAOA where they are allowed but penalized to discourage solutions that violate the budget constraint. Here, we call these variations E-QAOA-I (extended QAOA) and E-QAOA-II.

Each variation takes a slightly different approach which is fully explained in the paper. For both variations, we were able to observe increasing performance with p [Figure 1(b) shows this for 3 tickers from p = 1 to 2 in E-QAOA-I]. Remarkably, the E-QAOA variations produced good portfolios with better than random chance even when over 1000 one and two qubit gates were required. On the Rigetti Aspen-10 QPU, this can be attributed to the native implementation of XY gates upon which our E-QAOA variations rely upon.

Some striking observations can be made from the R-QAOA benchmarks taken on IBM machines. These machines have been given quantum volume figures by IBM so can be used to reliably ascertain whether quantum volume and application specific performance coincide. We found that they do not. Indeed, the machine with the lowest reported quantum volume we used (ibmq_lima, quantum volume of eight) actually performed the best for portfolio optimization within the IBM QPUs! Furthermore, the performance of all machines (not just IBM) was observed to be highly variable. Fluctuations of up to 29% in portfolio quality were observed in identical measurements. This makes it quite clear that alongside any benchmark must also come a measure of variability.

If you are to take anything away from this blog post, it should be that while hopes of quantum supremacy in combinatorial optimization may seem far away, hardware (and algorithms) are measurably improving; even over the results of recent years. Given our observations of the non-alignment of general qubit quality benchmarks like quantum volume with our benchmarks, it is clear that the best way to keep track of progress is to benchmark machines for given applications. Finally, remember that just because a QPU gives you a result once, does not mean it will do so reliably. That is, because the performance of present QPUs fluctuates between different runs, any benchmarks must be accompanied by a variability statistic. 


To learn more about some of the ongoing work in quantum benchmarking check out the following resources: