java – Is String a numeric type with respect to switch, and is it always compiled to lookupswitch?

The following code returns whether a given String s equals any of the hard-coded strings. The method uses a switch statement to do so:

public class SwitchOnString {
    public static boolean equalsAny(String s) {
        switch (s) {
        case "string 1":
            return true;
        case "string 2":
            return true;
        default:
            return false;
        }
    }
}

According to the Java Virtual Machine Specification (JVMS), §3.10 Compiling Switches:

Compilation of switch statements uses the tableswitch and lookupswitch
instructions.

Moreover:

tableswitch and lookupswitch instructions operate only on int data.

I read through section 3.10 but found no mention of String.

The only sentence that comes indirectly close is:

Other numeric types must be narrowed to type int for use in a switch.

Question 1:
Is String also a numeric type in this context? Or have I missed something?

Running javap -c on the class SwitchOnString shows:

Compiled from "SwitchOnString.java"
public class playground.SwitchOnString {
  public playground.SwitchOnString();
   ...

  public static boolean equalsAny(java.lang.String);
    Code:
       0: aload_0
       1: dup
       2: astore_1
       3: invokevirtual #16                 // Method java/lang/String.hashCode:()I
       6: lookupswitch  { // 2
            1117855161: 32
            1117855162: 44
               default: 60
          }
   ...

}

Evidently, the hashCode values are used as the int keys of the cases. This might match:

The lookupswitch instruction pairs int keys (the values of the case
labels) …

Going on about tableswitch and lookupswitch, the JVMS says:

The tableswitch instruction is used when the cases of the switch can
be efficiently represented as indices into a table of target offsets. (…)
Where the cases of the switch are sparse, the table representation of
the tableswitch instruction becomes inefficient in terms of space. The
lookupswitch instruction may be used instead.

If I understand this correctly, the sparser the cases are, the more likely it is that lookupswitch will be used.

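Question 2:
But looking at the bytecode:
Are two String cases sparse enough for the switch to be compiled to lookupswitch? Or is every switch on a String compiled to lookupswitch?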

Best answer: The specification does not say how switch statements have to be compiled; that is up to the compiler.

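In that regard, the JVMS statement “Other numeric types must be narrowed to type int for use in a switch” does not say that the Java programming language performs such a conversion, nor that String or Enum are numeric types. For example, long, float and double are numeric types, yet the Java programming language does not support using them with switch statements.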
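To illustrate the distinction, here is a minimal sketch (the class and method names are made up for this example). The language accepts a char selector directly, because it is handled as an int, whereas a long has to be converted to int explicitly by the programmer before it can be switched on.

```java
public class NumericSwitchDemo {
    // char is an integral type that is handled as int inside a switch.
    public static String describe(char c) {
        switch (c) {
            case 'a': return "letter a";
            case 'b': return "letter b";
            default:  return "other";
        }
    }

    // "switch (value)" on a long is rejected by the compiler, so the
    // conversion has to be written out. Math.toIntExact throws an
    // ArithmeticException if the value does not fit into an int.
    public static String describe(long value) {
        switch (Math.toIntExact(value)) {
            case 1:  return "one";
            case 2:  return "two";
            default: return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe('a'));
        System.out.println(describe(2L));
    }
}
```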

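The language specification says that switching on a String is supported, so compilers must find a way to compile such statements to bytecode. Using an invariant property like the hash code is a common solution, but in principle other properties, such as the length or an arbitrary character, could be used as well.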
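A small sketch of why the hash code is usable as a key, and why it is not sufficient on its own. The algorithm of String.hashCode() is fixed by its Javadoc, so the values are the same on every JVM. The class and method names below are hypothetical and only serve the illustration.

```java
public class HashKeyDemo {
    // True when two different strings would land on the same int key.
    public static boolean sameKeyButDifferent(String x, String y) {
        return x.hashCode() == y.hashCode() && !x.equals(y);
    }

    public static void main(String[] args) {
        // Stable values that a compiler can bake into the class file.
        System.out.println("a".hashCode());  // 97
        System.out.println("b".hashCode());  // 98
        System.out.println("".hashCode());   // 0

        // "Aa" and "BB" share the hash code 2112, so a hash match alone
        // does not identify the case label; an equals() check must follow.
        System.out.println(sameKeyButDifferent("Aa", "BB")); // true
    }
}
```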

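As discussed in “Why switch on String compiles into two switches” and “Java 7 String switch decompiled: unexpected instruction”, javac currently generates two switch instructions at the bytecode level when compiling a switch over String values (ECJ also generates two instructions, but the details may differ).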

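The compiler then has to pick either a lookupswitch or a tableswitch instruction. javac does use tableswitch when the numbers are not sparse, but only if the statement has more than two case labels.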

So when I compile the following method:

public static char two(String s) {
    switch(s) {
        case "a": return 'a';
        case "b": return 'b';
    }
    return 0;
}

I get

public static char two(java.lang.String);
Code:
   0: aload_0
   1: astore_1
   2: iconst_m1
   3: istore_2
   4: aload_1
   5: invokevirtual #9                  // Method java/lang/String.hashCode:()I
   8: lookupswitch  { // 2
                97: 36
                98: 50
           default: 61
      }
  36: aload_1
  37: ldc           #10                 // String a
  39: invokevirtual #11                 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
  42: ifeq          61
  45: iconst_0
  46: istore_2
  47: goto          61
  50: aload_1
  51: ldc           #12                 // String b
  53: invokevirtual #11                 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
  56: ifeq          61
  59: iconst_1
  60: istore_2
  61: iload_2
  62: lookupswitch  { // 2
                 0: 88
                 1: 91
           default: 94
      }
  88: bipush        97
  90: ireturn
  91: bipush        98
  93: ireturn
  94: iconst_0
  95: ireturn
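Read back into Java, the listing above corresponds roughly to the following hand-written sketch. The class name and the local variable name index are invented for illustration; this is an approximation of the two-switch structure, not javac's literal output.

```java
public class DesugaredTwo {
    public static char two(String s) {
        int index = -1; // corresponds to iconst_m1 / istore_2

        // First switch: dispatch on the hash code, then confirm the match
        // with equals(). As with the original switch, a null argument
        // causes a NullPointerException here.
        switch (s.hashCode()) {
            case 97:
                if (s.equals("a")) index = 0;
                break;
            case 98:
                if (s.equals("b")) index = 1;
                break;
        }

        // Second switch: dispatch on the dense case index 0..n-1.
        switch (index) {
            case 0: return 'a';
            case 1: return 'b';
        }
        return 0;
    }
}
```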

But when I compile

public static char three(String s) {
    switch(s) {
        case "a": return 'a';
        case "b": return 'b';
        case "c": return 'c';
    }
    return 0;
}

I get

public static char three(java.lang.String);
Code:
   0: aload_0
   1: astore_1
   2: iconst_m1
   3: istore_2
   4: aload_1
   5: invokevirtual #9                  // Method java/lang/String.hashCode:()I
   8: tableswitch   { // 97 to 99
                97: 36
                98: 50
                99: 64
           default: 75
      }
  36: aload_1
  37: ldc           #10                 // String a
  39: invokevirtual #11                 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
  42: ifeq          75
  45: iconst_0
  46: istore_2
  47: goto          75
  50: aload_1
  51: ldc           #12                 // String b
  53: invokevirtual #11                 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
  56: ifeq          75
  59: iconst_1
  60: istore_2
  61: goto          75
  64: aload_1
  65: ldc           #13                 // String c
  67: invokevirtual #11                 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
  70: ifeq          75
  73: iconst_2
  74: istore_2
  75: iload_2
  76: tableswitch   { // 0 to 2
                 0: 104
                 1: 107
                 2: 110
           default: 113
      }
 104: bipush        97
 106: ireturn
 107: bipush        98
 109: ireturn
 110: bipush        99
 112: ireturn
 113: iconst_0
 114: ireturn

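It is not clear why javac makes this choice. While tableswitch has a larger base footprint than lookupswitch (one additional 32-bit word), it would still be shorter in the bytecode, even in the scenario with two case labels.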
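The size claim can be checked with a little arithmetic based on the instruction layouts in the JVMS. The helper below is a sketch with made-up names; pc is the bytecode offset of the switch opcode, which determines the alignment padding.

```java
public class SwitchSize {
    // Padding bytes after the opcode so that the operands start at an
    // offset that is a multiple of four from the start of the method.
    static int padding(int pc) {
        return (4 - (pc + 1) % 4) % 4;
    }

    // opcode + padding + default, low, high (4 bytes each)
    // + one 4-byte jump offset per key in [low, high]
    public static int tableswitchSize(int pc, int low, int high) {
        return 1 + padding(pc) + 12 + 4 * (high - low + 1);
    }

    // opcode + padding + default, npairs (4 bytes each)
    // + one 8-byte (key, offset) pair per case
    public static int lookupswitchSize(int pc, int npairs) {
        return 1 + padding(pc) + 8 + 8 * npairs;
    }

    public static void main(String[] args) {
        // The first switch of two(): keys 97 and 98 at offset 8.
        System.out.println(lookupswitchSize(8, 2));     // 28
        System.out.println(tableswitchSize(8, 97, 98)); // 24
    }
}
```

The 28 bytes agree with the listing of two(), where the lookupswitch at offset 8 is followed by the instruction at offset 36.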

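The consistency of the decision, however, can be seen in the second switch instruction: its keys always form the same dense range starting at zero, yet it is compiled to lookupswitch or tableswitch depending only on the number of labels. So when truly sparse values are used: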

public static char three(String s) {
    switch(s) {
        case "a": return 'a';
        case "b": return 'b';
        case "": return 0;
    }
    return 0;
}

it compiles to

public static char three(java.lang.String);
Code:
   0: aload_0
   1: astore_1
   2: iconst_m1
   3: istore_2
   4: aload_1
   5: invokevirtual #9                  // Method java/lang/String.hashCode:()I
   8: lookupswitch  { // 3
                 0: 72
                97: 44
                98: 58
           default: 83
      }
  44: aload_1
  45: ldc           #10                 // String a
  47: invokevirtual #11                 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
  50: ifeq          83
  53: iconst_0
  54: istore_2
  55: goto          83
  58: aload_1
  59: ldc           #12                 // String b
  61: invokevirtual #11                 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
  64: ifeq          83
  67: iconst_1
  68: istore_2
  69: goto          83
  72: aload_1
  73: ldc           #13                 // String
  75: invokevirtual #11                 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
  78: ifeq          83
  81: iconst_2
  82: istore_2
  83: iload_2
  84: tableswitch   { // 0 to 2
                 0: 112
                 1: 115
                 2: 118
           default: 120
      }
 112: bipush        97
 114: ireturn
 115: bipush        98
 117: ireturn
 118: iconst_0
 119: ireturn
 120: iconst_0
 121: ireturn

It uses lookupswitch for the sparse hash codes, but tableswitch for the second switch.
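All the outcomes shown above are consistent with a simple cost comparison between the two instructions. The sketch below is modeled on the heuristic that, to the best of my knowledge, javac's code generator applies; the constants are an assumption and have not been checked against a specific JDK release, and the class and method names are hypothetical.

```java
public class SwitchChoice {
    // Returns true when the modeled heuristic prefers tableswitch.
    // lo and hi are the smallest and largest case keys,
    // nlabels is the number of case labels.
    public static boolean prefersTableswitch(long lo, long hi, int nlabels) {
        long tableSpaceCost = 4 + (hi - lo + 1); // grows with the key range
        long tableTimeCost = 3;                  // constant-time jump
        long lookupSpaceCost = 3 + 2L * nlabels; // grows with the label count
        long lookupTimeCost = nlabels;           // search over the keys
        return nlabels > 0
            && tableSpaceCost + 3 * tableTimeCost
               <= lookupSpaceCost + 3 * lookupTimeCost;
    }

    public static void main(String[] args) {
        System.out.println(prefersTableswitch(97, 98, 2)); // false: "a", "b"
        System.out.println(prefersTableswitch(97, 99, 3)); // true:  "a", "b", "c"
        System.out.println(prefersTableswitch(0, 98, 3));  // false: "", "a", "b"
        System.out.println(prefersTableswitch(0, 2, 3));   // true:  index 0..2
    }
}
```

Under this model a dense two-label switch loses to lookupswitch only because of the weighting of the time cost, which would match the observation that tableswitch appears from three labels onward.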
