Being the Best Car Salesman

Car Salesman Problem

How to be a Good Car Salesman?

Let’s say that you put your car on the market and a line of willing customers forms to buy it. Each buyer has a purchase price that they are willing to pay; however, this information is not available to you. When a customer is at the front of the line, you have the option of either 1. accepting the purchase price that the customer offers, or 2. rejecting their offer and moving on to the next customer. All rejected customers leave the market permanently. If you accept a customer’s offer, then the car is sold to that customer for their purchase price. Now the question is: how should you go about selling the car? Accepting an offer too early may mean missing out on future customers with higher purchase prices, but aggressive rejection may force you to accept a lower offer later down the line.

We explore this trade-off through the field of Optimal Stopping Theory.

Optimal Stopping Problems

The theory of optimal stopping involves choosing a time to stop a stochastic process so as to maximize a reward that depends on a given reward function and the observed realization. Formally, these problems are defined by two things:

1. a sequence of random variables $X_1, X_2, \dots$ whose joint distribution is assumed to be known, and
2. a sequence of reward functions $y_t(\{x_i\}_{i \le t})$ that depend on the values observed up to time $t$.

The problem is to define a stopping rule (a decision of when to stop, based only on the values observed so far) that maximizes the expected profit

$$\text{Profit} = \max_{t}\; \mathbb{E}\big[y_t(\{x_i\}_{i \le t})\big].$$
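To make this objective concrete, here is a small Monte Carlo sketch of how one might estimate the expected profit of a candidate stopping rule, for the common special case $y_t = x_t$ (you earn the value observed at the moment you stop). The function names and the uniform-offer example are our own illustrative choices, not part of any standard library.

```python
import random

def expected_profit(stop_rule, sample_sequence, trials=100_000, seed=0):
    """Monte Carlo estimate of E[y_t] for the reward y_t = x_t.

    stop_rule(history)   -> True to stop at the current observation.
    sample_sequence(rng) -> one fresh realization (a list of values).
    If the rule never fires, you are forced to take the last value.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = sample_sequence(rng)
        reward = xs[-1]  # forced to accept the final observation
        history = []
        for x in xs:
            history.append(x)
            if stop_rule(history):
                reward = x
                break
        total += reward
    return total / trials

# Hypothetical example: five Uniform(0, 1) offers; stop at the first one above 0.8.
rule = lambda history: history[-1] > 0.8
offers = lambda rng: [rng.random() for _ in range(5)]
profit = expected_profit(rule, offers)
```

The threshold 0.8 here is arbitrary; comparing thresholds with this harness is one crude way to search for a good rule.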

There are many applications of this problem in optimal control theory, event-triggered systems, and switched systems (Karatzas 1984). Other applications include investment strategies and hypothesis testing. Some interesting examples of optimal stopping problems, taken from (Ferguson 2006), are given below.

Maximizing Heads

You observe a fair coin being tossed repeatedly, and you can stop the process at any time. The question is: when should you stop the process to maximize the proportion of heads observed in the sequence so far? To answer this, you need the optimal stopping rule and its corresponding guarantees. If the first coin toss is heads, you should stop immediately and collect a reward of 1, the best possible. On the other hand, if the proportion is ever below 1/2 at some point, you can always wait long enough for it to climb back up to (roughly) 1/2, by the law of large numbers. In fact, there is a stopping rule that achieves an expected payoff of about .79 for the proportion of heads in the sequence!
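A simple heuristic rule, "stop the first time the running fraction of heads exceeds 1/2" (which stops immediately on a first-toss heads), can be simulated. To be clear, this is not the optimal rule from the .79 result, and the cap on the number of tosses is a practical concession since the rule can take a long time to fire:

```python
import random

def heads_payoff(trials=100_000, cap=1000, seed=0):
    """Estimate the expected final fraction of heads under the rule:
    stop the first time (#heads)/t exceeds 1/2; give up after `cap` tosses."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        heads = 0
        for t in range(1, cap + 1):
            heads += rng.random() < 0.5
            if 2 * heads > t:  # running fraction strictly above 1/2
                break
        total += heads / t
    return total / trials
```

Even this naive rule earns noticeably more than the 1/2 you would get by never stopping, though less than the optimal ≈.79.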

One-Armed Bandit Problem

Let there be two treatments for a disease: an experimental treatment A, whose probability of success $p_A$ is not known, and the standard or established treatment B, whose probability of success $p_B$ is known. You have a group of patients that are treated sequentially, and you would like to decide which treatment to give each of them to maximize the number of patients cured. You can only learn about $p_A$ from the responses of previous patients in the sequence. If it is ever determined that $p_A \le p_B$, then you should switch to treatment B for the rest of the patients (and similarly in the other direction). Therefore, this problem can be formulated as an optimal stopping problem: when to stop experimenting with A.
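One simple (and deliberately naive) version of such a rule can be sketched as follows: track a Bayesian estimate of $p_A$ under a uniform prior, and switch permanently to B once the estimate falls to $p_B$ or below. The function name and the specific rule are our own illustration, not the optimal bandit policy:

```python
import random

def treat_patients(p_a, p_b, n_patients, seed=None):
    """Illustrative stopping heuristic for the one-armed bandit:
    use treatment A while its posterior-mean success rate, under a
    Beta(1, 1) prior, stays at or above the known rate p_b; once it
    drops below, switch to B for all remaining patients.
    Returns the number of patients cured."""
    rng = random.Random(seed)
    successes = failures = 0
    cured = 0
    switched = False
    for _ in range(n_patients):
        if not switched and (successes + 1) / (successes + failures + 2) >= p_b:
            ok = rng.random() < p_a  # give experimental treatment A
            successes += ok
            failures += not ok
        else:
            switched = True          # permanent switch to the known treatment B
            ok = rng.random() < p_b
        cured += ok
    return cured
```

Note how the switch is one-directional, which is exactly what makes this a stopping problem rather than a general bandit problem.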

Revisiting the Car Salesman Problem

Let’s revisit the car salesman problem and make it a little more formal. Consider a set of $n$ buyers, where each buyer’s purchase price is a random variable $X_i$ for $i \in \{1, \dots, n\}$. (We remark that you don’t know what order the customers will line up in.) The distribution of each of the random variables is known to you (for example, you have some knowledge of customer demand), but the exact purchase price realizations are not.

The exact customer selection process is as follows. You begin at the front of the line, where $i = 1$ denotes the first customer. At each step $i$, you observe the $i$th customer’s purchase price realization $x_i \sim X_i$. As before, you either accept $x_i$ (the realized purchase price) or move on to customer $i + 1$.

To assess the performance of your decision strategy, we compare its expected profit to the case where you knew all of the realizations $x_i$ beforehand (and not just the distributions $X_i$). In the case of full information, the profit is the maximum $\max_i x_i$; define $X_{\max} \triangleq \max(X_1, \dots, X_n)$. The seminal paper (Krengel 1978) showed that there exists a decision rule that guarantees an expected payoff of $\frac{1}{2}\mathbb{E}[X_{\max}]$. In fact, no customer selection strategy can do better, as the following example shows.

Example.

Let there be two customers, with purchase prices $X_1 = 1$ with probability 1, and $X_2 = \frac{1}{\varepsilon}$ with probability $\varepsilon$ (for some $0 < \varepsilon \le 1$) and $X_2 = 0$ otherwise. A car salesman can either pick customer 1 or customer 2, and either way gets an expected payoff of 1. If you as the car salesman knew the realizations, you could guarantee a payoff of $\mathbb{E}[X_{\max}] = P(X_2 > X_1) \cdot \frac{1}{\varepsilon} + P(X_1 \ge X_2) \cdot 1 = 2 - \varepsilon$. Your strategy does not matter in this scenario: if you don’t know the purchase price realizations, you are forced to incur the $\frac{1}{2}$ factor penalty as $\varepsilon \to 0$.
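The arithmetic in this example is easy to check numerically. Here is a quick Monte Carlo sketch (the helper name is ours):

```python
import random

def prophet_vs_gambler(eps, trials=200_000, seed=0):
    """Two customers: X1 = 1 always; X2 = 1/eps with probability eps, else 0.
    Returns (estimated E[X_max], payoff of any fixed single-customer choice).
    Analytically: E[X_max] = eps * (1/eps) + (1 - eps) * 1 = 2 - eps, while
    any strategy without foresight earns exactly 1 in expectation."""
    rng = random.Random(seed)
    total_max = 0.0
    for _ in range(trials):
        x2 = (1.0 / eps) if rng.random() < eps else 0.0
        total_max += max(1.0, x2)
    return total_max / trials, 1.0
```

As $\varepsilon \to 0$, the estimated $\mathbb{E}[X_{\max}]$ approaches 2, twice what any online strategy can earn.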

A Simple Optimal Strategy

Let’s dispense with the suspense and finally discuss the optimal strategy for a car salesman. In fact, the optimal strategy is a fixed price mechanism! This is shown as a prophet inequality in (Correa 2017). In other words, you should set a sticker price and accept the first customer whose purchase price realization is above it. The optimal sticker price corresponds to the median of the distribution of $X_{\max}$ (which you can compute, since you know the distributions $X_1, \dots, X_n$). If you know all the customers share the same purchase price distribution, the efficiency of this fixed price mechanism rises to about 74.5%.
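Here is a sketch of this mechanism in simulation (the function name and the uniform-demand example are our own): estimate the median sticker price $M$ from samples of $X_{\max}$, then run the fixed price rule and compare its expected profit with $\mathbb{E}[X_{\max}]$.

```python
import random

def fixed_price_profit(samplers, trials=100_000, seed=0):
    """Returns (expected profit of the median sticker price, E[X_max]).
    `samplers` is a list of functions rng -> offer, one per customer."""
    rng = random.Random(seed)
    draw = lambda: [s(rng) for s in samplers]
    # Step 1: estimate the median M of X_max (and E[X_max]) by sampling.
    maxes = sorted(max(draw()) for _ in range(trials))
    M = maxes[trials // 2]
    e_max = sum(maxes) / trials
    # Step 2: run the fixed-price rule on fresh realizations:
    # accept the first offer of at least M (profit 0 if nobody qualifies).
    profit = sum(next((x for x in draw() if x >= M), 0.0)
                 for _ in range(trials))
    return profit / trials, e_max

# Hypothetical demand: five customers with Uniform(0, 1) purchase prices.
alg, opt = fixed_price_profit([lambda r: r.random()] * 5)
```

For the uniform example the rule sells only about half the time, but when it does sell, the price is high, and the estimated profit comfortably clears the $\frac{1}{2}\mathbb{E}[X_{\max}]$ guarantee.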

Why is this interesting? There are many different market designs, like trading, bartering, or auctions. But there is a reason that fixed pricing is predominantly used in stores: it is quick, easy, and apparently optimal in a theoretical sense as well! You can thank this result as the reason you don’t have to haggle in grocery stores.

We give the proof of optimality for the fixed price mechanism here. Let $M$ denote the median of $X_{\max}$, so that $P(X_{\max} \ge M) \ge \frac{1}{2}$ and $P(X_{\max} \le M) \ge \frac{1}{2}$; for simplicity, assume the distributions are continuous, so that ties with $M$ occur with probability 0 and $P(X_{\max} > M) = P(X_{\max} \le M) = \frac{1}{2}$. Let $\mathbb{E}[X_{\mathrm{alg}}]$ denote the expected profit of the fixed price algorithm with sticker price $M$; note that $\mathbb{E}[X_{\mathrm{alg}} \mid X_{\max} \le M] = 0$, since no offer above $M$ means no sale. Then we have the following chain of (in)equalities:

$$\begin{aligned}
\mathbb{E}[X_{\mathrm{alg}}] &= \sum_{i=1}^{n} \mathbb{E}\big[X_{\mathrm{alg}} \,\big|\, X_i \text{ is the first offer above } M\big]\, P\big(X_i \text{ is the first offer above } M\big) + \mathbb{E}\big[X_{\mathrm{alg}} \,\big|\, X_{\max} \le M\big]\, P(X_{\max} \le M) \\
&= \sum_{i=1}^{n} \mathbb{E}\big[X_i \,\big|\, X_i \text{ is the first offer above } M\big]\, P\big(X_i \text{ is the first offer above } M\big) \\
&= \sum_{i=1}^{n} \mathbb{E}\big[(X_i - M) + M \,\big|\, X_i \text{ is the first offer above } M\big]\, P\big(X_i \text{ is the first offer above } M\big) \\
&= M\, P(X_{\max} > M) + \sum_{i=1}^{n} \mathbb{E}\big[X_i - M \,\big|\, X_i > M\big]\, P\Big(\bigcap_{j<i} \{X_j \le M\}\Big)\, P(X_i > M) \\
&= M\, P(X_{\max} > M) + \sum_{i=1}^{n} \mathbb{E}\big[(X_i - M)^{+}\big]\, P\Big(\bigcap_{j<i} \{X_j \le M\}\Big) \\
&\ge M\, P(X_{\max} > M) + \sum_{i=1}^{n} \mathbb{E}\big[(X_i - M)^{+}\big]\, P(X_{\max} \le M) \\
&\ge M\, P(X_{\max} > M) + \mathbb{E}\big[(X_{\max} - M)^{+}\big]\, P(X_{\max} \le M) \\
&\ge \tfrac{1}{2}\, M + \tfrac{1}{2}\, \mathbb{E}\big[(X_{\max} - M)^{+}\big] \\
&\ge \tfrac{1}{2}\, \mathbb{E}[X_{\max}].
\end{aligned}$$

Here we use the notation $\mathbb{E}[(X_i - M)^{+}] \triangleq \mathbb{E}[\max\{X_i - M, 0\}]$. Note that
$$\mathbb{E}[(X_i - M)^{+}] = \mathbb{E}\big[(X_i - M)^{+} \,\big|\, X_i > M\big]\, P(X_i > M) + \mathbb{E}\big[(X_i - M)^{+} \,\big|\, X_i \le M\big]\, P(X_i \le M) = \mathbb{E}\big[X_i - M \,\big|\, X_i > M\big]\, P(X_i > M),$$
since the second conditional expectation is $0$. In the chain, the first inequality uses $P(\bigcap_{j<i}\{X_j \le M\}) \ge P(X_{\max} \le M)$; the second uses $\sum_{i=1}^{n} (X_i - M)^{+} \ge (X_{\max} - M)^{+}$; the third uses the median property; and the last uses the pointwise bound $X_{\max} \le M + (X_{\max} - M)^{+}$.