A Summary of Dynamic Programming Problems That Predict the Winner of a Game

August 22, 2023. Source: 骑士周游问题

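On LeetCode you often meet two-player game problems that ask whether one side wins or loses, for example Guess Number Higher or Lower, Can I Win, and Predict the Winner. These problems all ask whether the first player wins, assuming both sides play optimally.
All of them can be solved with dynamic programming. The key is a top-down memoization strategy: each time a small subproblem is solved, its result is recorded in a memo, and the next time the same subproblem comes up the result is simply looked up. For Can I Win, this lowers the cost from O(n!) for a plain search over move orders to O(2^n) distinct states, each solved once with at most n candidate moves.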
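To make the memo pattern concrete, here is a minimal sketch on the simplest additive game: players alternately add an integer from 1..maxAdd to a running total (reuse allowed), and whoever reaches the target first wins. The names canWinFrom and firstPlayerWins are made up for this illustration and do not come from any of the problems.

```cpp
#include <vector>

// Hypothetical helper: can the player to move win when `remaining` is still
// needed and each move adds any integer in 1..maxAdd (reuse allowed)?
// memo[r]: -1 = not computed yet, 0 = lose, 1 = win.
bool canWinFrom(int remaining, int maxAdd, std::vector<int>& memo) {
    if (remaining <= 0) return false;  // the opponent already reached the total
    if (memo[remaining] != -1) return memo[remaining] == 1;  // look up the memo
    bool win = false;
    for (int add = 1; add <= maxAdd && !win; ++add) {
        // A move wins if it leaves the opponent in a losing position.
        if (!canWinFrom(remaining - add, maxAdd, memo)) win = true;
    }
    memo[remaining] = win ? 1 : 0;  // record the result before returning
    return win;
}

// Assumes target >= 1.
bool firstPlayerWins(int target, int maxAdd) {
    std::vector<int> memo(target + 1, -1);
    return canWinFrom(target, maxAdd, memo);
}
```

Each value of remaining is solved only once, so the work grows with target * maxAdd instead of with the number of possible move sequences.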
1) Can I Win:
In the “100 game,” two players take turns adding, to a running total, any integer from 1..10. The player who first causes the running total to reach or exceed 100 wins.

What if we change the game so that players cannot re-use integers?

For example, two players might take turns drawing from a common pool of numbers of 1..15 without replacement until they reach a total >= 100.

Given an integer maxChoosableInteger and another integer desiredTotal, determine if the first player to move can force a win, assuming both players play optimally.

You can always assume that maxChoosableInteger will not be larger than 20 and desiredTotal will not be larger than 300.
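Idea: use a hash table to record the result for each possible set of choices. Here the table is a map whose key is used, an integer bitmask of the numbers already taken.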
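The solution for this problem keys its memo on an integer bitmask of the numbers already taken. A small sketch of that encoding follows; isUsed, markUsed, and usedSum are hypothetical helper names introduced only for illustration.

```cpp
// Bit (k-1) of `used` is 1 exactly when the number k has already been taken.
bool isUsed(int used, int k) { return (used & (1 << (k - 1))) != 0; }

// Returns the bitmask after additionally taking the number k.
int markUsed(int used, int k) { return used | (1 << (k - 1)); }

// Sum of the numbers recorded in `used`, considering the numbers 1..n.
int usedSum(int used, int n) {
    int sum = 0;
    for (int k = 1; k <= n; ++k) {
        if (isUsed(used, k)) sum += k;
    }
    return sum;
}
```

Because the sum of the taken numbers is fully determined by the bitmask, the amount still needed is too. That is why the memo can be keyed on the bitmask alone, without also storing the remaining total.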

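#include <map>
using namespace std;

class Solution {
    map<int, bool> m;  // memo: bitmask of used numbers -> can the player to move win?

    // desiredTotal: amount still needed; used: bitmask of taken numbers
    // (bit i set means the number i+1 is taken); n: maxChoosableInteger.
    bool helper(int desiredTotal, int used, int n) {
        if (m.count(used) != 0) return m[used];  // already solved: return the stored result
        if (desiredTotal <= 0) {  // the opponent's last move already reached the total: lose
            m[used] = false;
            return false;
        }
        int bit = 1;
        for (int i = 0; i < n; i++, bit <<= 1) {
            if ((used & bit) == 0) {  // the number i+1 has not been used yet
                if (i + 1 >= desiredTotal) {  // taking i+1 reaches the total at once
                    m[used] = true;
                    return true;
                }
                // Take i+1 and ask whether the opponent wins from the resulting state.
                bool nextwinner = helper(desiredTotal - i - 1, used | bit, n);
                if (!nextwinner) {  // the opponent loses, so this move wins
                    m[used] = true;
                    return true;
                }
            }
        }
        m[used] = false;  // every available move lets the opponent win
        return false;
    }

public:
    bool canIWin(int maxChoosableInteger, int desiredTotal) {
        int n = maxChoosableInteger;
        int sum = (1 + maxChoosableInteger) * maxChoosableInteger / 2;
        if (desiredTotal <= 0) return true;  // nothing left to reach
        if (sum < desiredTotal) return false;  // unreachable even with every number
        if (maxChoosableInteger >= desiredTotal) return true;  // the first move wins
        m.clear();  // the memo is only valid for one (n, desiredTotal) pair
        return helper(desiredTotal, 0, n);
    }
};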
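As a variant of the memo used for this problem (a sketch under my own assumptions, not the original author's code), the map can be swapped for a flat array indexed by the bitmask, since maxChoosableInteger is at most 20 and there are at most 2^20 masks. The names winFrom and canIWinFlat are hypothetical.

```cpp
#include <vector>

// memo[used]: 0 = not computed yet, 1 = player to move wins, 2 = player to move loses.
bool winFrom(int remaining, int used, int n, std::vector<char>& memo) {
    if (memo[used] != 0) return memo[used] == 1;
    bool win = false;
    for (int k = 1; k <= n && !win; ++k) {
        int bit = 1 << (k - 1);
        if (used & bit) continue;  // k is already taken
        // Taking k wins if it reaches the total, or if the opponent then loses.
        if (k >= remaining || !winFrom(remaining - k, used | bit, n, memo)) win = true;
    }
    memo[used] = win ? 1 : 2;
    return win;
}

bool canIWinFlat(int maxChoosableInteger, int desiredTotal) {
    int n = maxChoosableInteger;
    if (desiredTotal <= 0) return true;                // nothing left to reach
    if (n * (n + 1) / 2 < desiredTotal) return false;  // total is unreachable
    std::vector<char> memo(1 << n, 0);
    return winFrom(desiredTotal, 0, n, memo);
}
```

The array lookup avoids the logarithmic cost of each map access, at the price of always allocating 2^n entries even when few states are reachable.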

2) Predict the Winner:
Given an array of scores that are non-negative integers. Player 1 picks one of the numbers from either end of the array followed by the player 2 and then player 1 and so on. Each time a player picks a number, that number will not be available for the next player. This continues until all the scores have been chosen. The player with the maximum score wins.

Given an array of scores, predict whether player 1 is the winner. You can assume each player plays to maximize his score.

Example 1:
Input: [1, 5, 2]
Output: False
Explanation: Initially, player 1 can choose between 1 and 2.
If he chooses 2 (or 1), then player 2 can choose from 1 (or 2) and 5. If player 2 chooses 5, then player 1 will be left with 1 (or 2).
So, final score of player 1 is 1 + 2 = 3, and player 2 is 5.
Hence, player 1 will never be the winner and you need to return False.
Example 2:
Input: [1, 5, 233, 7]
Output: True
Explanation: Player 1 first chooses 1. Then player 2 has to choose between 5 and 7. No matter which number player 2 chooses, player 1 can choose 233.
Finally, player 1 has a higher score (234) than player 2 (12), so you need to return True, meaning player 1 can win.
Note:
1 <= length of the array <= 20.
Any scores in the given array are non-negative integers and will not exceed 10,000,000.
If the scores of both players are equal, then player 1 is still the winner.
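This is also a two-player game, and the side with the larger sum wins. Treat a number you take from either end as added to your result, and a number the opponent takes as subtracted from it. If the final result is not less than 0, your score is at least the opponent's, which counts as a win for player 1. A memo table again records the result of each subproblem: dp[s][e] holds the solution on the subarray nums[s,…,e], that is, the best score difference (player to move minus opponent) achievable on it.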

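#include <algorithm>
#include <climits>
#include <vector>
using namespace std;

class Solution {
public:
    bool PredictTheWinner(vector<int>& nums) {
        int n = nums.size();
        // Memo. INT_MIN marks "not computed yet", because 0 is a valid score difference.
        vector<vector<int>> dp(n, vector<int>(n, INT_MIN));
        int res = DP(dp, 0, n - 1, nums);
        return res >= 0;  // a tie still counts as a win for player 1
    }

    // Best score difference (player to move minus opponent) on nums[s..e].
    int DP(vector<vector<int>>& dp, int s, int e, vector<int>& nums) {
        if (s == e) return nums[s];
        if (dp[s][e] != INT_MIN) return dp[s][e];
        int tmp = max(nums[s] - DP(dp, s + 1, e, nums), nums[e] - DP(dp, s, e - 1, nums));
        dp[s][e] = tmp;
        return dp[s][e];
    }
};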
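The same dp[s][e] recurrence can also be filled bottom-up by increasing subarray length. The following is a sketch of that alternative, not the approach taken in the solution for this problem; the function name predictTheWinnerBottomUp is hypothetical.

```cpp
#include <algorithm>
#include <vector>

// dp[s][e] is the best score difference (player to move minus opponent)
// on the subarray nums[s..e].
bool predictTheWinnerBottomUp(const std::vector<int>& nums) {
    int n = static_cast<int>(nums.size());
    if (n == 0) return true;  // assumption: an empty game is treated as a tie
    std::vector<std::vector<int>> dp(n, std::vector<int>(n, 0));
    for (int i = 0; i < n; ++i) dp[i][i] = nums[i];  // one number left: take it
    for (int len = 2; len <= n; ++len) {
        for (int s = 0; s + len - 1 < n; ++s) {
            int e = s + len - 1;
            // Take the left or the right end; the opponent then plays on the rest.
            dp[s][e] = std::max(nums[s] - dp[s + 1][e], nums[e] - dp[s][e - 1]);
        }
    }
    return dp[0][n - 1] >= 0;
}
```

Filling by length guarantees that dp[s+1][e] and dp[s][e-1] are ready before dp[s][e] is computed, so no "not computed" marker is needed.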

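The two-player games covered here all assume that the opponent's decisions are optimal too, which is the minimax algorithm. Using DP with a memo, instead of plain recursion that re-solves the same subproblems, makes it much faster.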
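One way to see the speed-up is to count calls. The sketch below runs the Predict the Winner recurrence once as plain minimax recursion and once with a memo. The function names and the calls counter are additions for this illustration only.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Plain minimax recursion; `calls` counts how many times the function runs.
int diffPlain(const std::vector<int>& nums, int s, int e, long long& calls) {
    ++calls;
    if (s == e) return nums[s];
    return std::max(nums[s] - diffPlain(nums, s + 1, e, calls),
                    nums[e] - diffPlain(nums, s, e - 1, calls));
}

// The same recursion with a memo; INT_MIN marks an unsolved subproblem.
int diffMemo(const std::vector<int>& nums, int s, int e,
             std::vector<std::vector<int>>& memo, long long& calls) {
    ++calls;
    if (s == e) return nums[s];
    if (memo[s][e] != INT_MIN) return memo[s][e];
    memo[s][e] = std::max(nums[s] - diffMemo(nums, s + 1, e, memo, calls),
                          nums[e] - diffMemo(nums, s, e - 1, memo, calls));
    return memo[s][e];
}
```

For an array of n numbers the plain version makes 2^n - 1 calls, while the memoized version solves each of the n(n-1)/2 intervals with s < e only once, giving n(n-1) + 1 calls in total.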

    Original author: 骑士周游问题
    Original URL: https://blog.csdn.net/liuchenjane/article/details/54945078