amazon-web-services – Performing a Root Volume Swap for EC2 via the AWS console

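I have recently been using the root volume swap method to create persistent spot instances, as described here (approach 2). Typically, my spot instance takes 2-5 minutes to be fulfilled and for the swap to complete. However, on some days the process never finishes (or at least I get impatient after waiting anywhere from 20 minutes to an hour!).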

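To be clear, the instance is created, but the swap never happens: I can ssh to the server, but my persistent files are not there. I can also see this by going to my AWS console and noticing that "spotter" (my persistent storage) has no attachment information: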

[Screenshot: AWS console volume list in which "spotter" has no attachment information]

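Since the swap script I am using never gives me any errors, it is hard to see why it fails. So I am wondering whether, given the state shown in my screenshot, I can perform the swap "manually" using the AWS EC2 management console, and if so, how I would go about it.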

And, in case it helps, @Vorsprung:

I start the process by running the following script:

# The config file was created in ondemand_to_spot.sh
export config_file=my.conf
cd "$(dirname ${BASH_SOURCE[0]})"

. ../$config_file || exit -1

export request_id=`../ec2spotter-launch $config_file`
echo Spot request ID: $request_id

echo Waiting for spot request to be fulfilled...
aws ec2 wait spot-instance-request-fulfilled --spot-instance-request-ids $request_id

export instance_id=`aws ec2 describe-spot-instance-requests --spot-instance-request-ids $request_id --query="SpotInstanceRequests[*].InstanceId" --output="text"`

echo Waiting for spot instance to start up...
aws ec2 wait instance-running --instance-ids $instance_id

echo Spot instance ID: $instance_id

echo 'Please allow the root volume swap script a few minutes to finish.'
if [ "x$ec2spotter_elastic_ip" = "x" ]
then
        # Non elastic IP
        export ip=`aws ec2 describe-instances --instance-ids $instance_id --filter Name=instance-state-name,Values=running --query "Reservations[*].Instances[*].PublicIpAddress" --output=text`
else
        # Elastic IP
        export ip=`aws ec2 describe-addresses --allocation-ids $ec2spotter_elastic_ip --output text --query 'Addresses[0].PublicIp'`
fi

export name=fast-ai
if [ "$ec2spotter_key_name" = "aws-key-$name" ]
then    function aws-ssh-spot {
        ssh -i ~/.ssh/aws-key-$name.pem ubuntu@$ip
        }
        function aws-terminate-spot {
        aws ec2 terminate-instances --instance-ids $instance_id
        }
        echo  Jupyter Notebook -- $ip:8888
fi

The contents of my.conf are:

# Name of root volume.
ec2spotter_volume_name=spotter
# Location (zone) of root volume. If not the same as ec2spotter_launch_zone,
# a copy will be created in ec2spotter_launch_zone.
# Can be left blank, if the same as ec2spotter_launch_zone
ec2spotter_volume_zone=us-west-2b

ec2spotter_launch_zone=us-west-2b
ec2spotter_key_name=aws-key-fast-ai
ec2spotter_instance_type=p2.xlarge
# Some instance types require a subnet to be specified:
ec2spotter_subnet=subnet-c9cba8af

ec2spotter_bid_price=0.55

# uncomment and update the value if you want an Elastic IP
# ec2spotter_elastic_ip=eipalloc-64d5890a

# Security group
ec2spotter_security_group=sg-2be79356

# The AMI to be used as the pre-boot environment. This is NOT your target system installation.
# Do Not Modify this unless you have a need for a different Kernel version from what's supplied.
# ami-6edd3078 is ubuntu-xenial-16.04-amd64-server-20170113
ec2spotter_preboot_image_id=ami-bc508adc

and the ec2spotter-launch script is:

#!/bin/bash

# "Phase 1" this is the user-facing script for launching a new spot instance

if [ "$1" = "" ]; then echo "USER ERROR: please specify a configuration file"; exit -1; fi

cd $(dirname $0)

. $1 || exit -1

# New instance:
# Desired launch zone
LAUNCH_ZONE=$ec2spotter_launch_zone
# Region is LAUNCH_ZONE minus the last character
LAUNCH_REGION=$(echo $LAUNCH_ZONE | sed -e 's/.$//')
PUB_KEY=$ec2spotter_key_name

# Existing Volume:
# If no volume zone
if [ "$ec2spotter_volume_zone" = "" ]
then # Use instance zone
        ec2spotter_volume_zone=$LAUNCH_ZONE
fi

# Name of volume (find it by name later)
ROOT_VOL_NAME=$ec2spotter_volume_name
# zone of volume (needed if different than instance zone)
ROOT_ZONE=$ec2spotter_volume_zone
# Region is Zone minus the last character
ROOT_REGION=$(echo $ROOT_ZONE | sed -e 's/.$//')


#echo "ROOT_VOL_NAME=${ROOT_VOL_NAME}; ROOT_ZONE=${ROOT_ZONE}; ROOT_REGION=${ROOT_REGION}; "
#echo "LAUNCH_ZONE=${LAUNCH_ZONE}; LAUNCH_REGION=${LAUNCH_REGION}; PUB_KEY=${PUB_KEY}"

AWS_ACCESS_KEY=`aws configure get aws_access_key_id`
AWS_SECRET_KEY=`aws configure get aws_secret_access_key`

aws ec2 describe-volumes \
        --filters Name=tag-key,Values="Name" Name=tag-value,Values="$ROOT_VOL_NAME" \
        --region ${ROOT_REGION} --output=json > volumes.tmp || exit -1

ROOT_VOL=$(jq -r '.Volumes[0].VolumeId' volumes.tmp)
ROOT_TYPE=$(jq -r '.Volumes[0].VolumeType' volumes.tmp)

#echo "ROOT_TYPE=$ROOT_TYPE; ROOT_VOL=$ROOT_VOL";
if [ "$ROOT_VOL_NAME" = "" ]
then
  echo "root volume lacks a Name tag";
  exit -1;
fi

cat >user-data.tmp <<EOF
#!/bin/sh
echo AWSAccessKeyId=$AWS_ACCESS_KEY > /root/.aws.creds
echo AWSSecretKey=$AWS_SECRET_KEY >> /root/.aws.creds

apt-get update
apt-get install -y jq
apt-get install -y python-pip python-setuptools
apt-get install -y git

pip install awscli

cd /root
git clone --depth=1 https://github.com/slavivanov/ec2-spotter.git
echo Got spotter scripts from github.

cd ec2-spotter

echo Swapping root volume
./ec2spotter-remount-root  --force 1 --vol_name ${ROOT_VOL_NAME} --vol_region ${ROOT_REGION} --elastic_ip $ec2spotter_elastic_ip
EOF

userData=$(base64 user-data.tmp | tr -d '\n');

cat >specs.tmp <<EOF
{
  "ImageId" : "$ec2spotter_preboot_image_id",
  "InstanceType": "$ec2spotter_instance_type",
  "KeyName" : "$PUB_KEY",
  "EbsOptimized": true,
  "Placement": {
     "AvailabilityZone": "$LAUNCH_ZONE"
  },
  "BlockDeviceMappings": [
    {
      "DeviceName": "/dev/sda1",
      "Ebs": {
        "DeleteOnTermination": true,
        "VolumeType": "gp2",
        "VolumeSize": 128
      }
    }
  ],
  "NetworkInterfaces": [
      {
        "DeviceIndex": 0,
        "SubnetId": "${ec2spotter_subnet}",
        "Groups": [ "${ec2spotter_security_group}" ],
        "AssociatePublicIpAddress": true
      }
  ],
  "UserData" : "${userData}"
}
EOF

SPOT_REQUEST_ID=$(aws ec2 request-spot-instances --launch-specification file://specs.tmp --spot-price $ec2spotter_bid_price --output="text" --query="SpotInstanceRequests[*].SpotInstanceRequestId" --region ${LAUNCH_REGION})
echo $SPOT_REQUEST_ID
# Clean up
rm user-data.tmp
rm specs.tmp
rm volumes.tmp

Best answer: This is not an exact answer, but it may help you find a way to debug the problem.

As far as I understand, this is the part of your setup, in the ec2spotter-launch script, that is responsible for the volume swap:

...
cat >specs.tmp <<EOF
{
  "ImageId" : "$ec2spotter_preboot_image_id",
  ...
  "UserData" : "${userData}"
}
EOF

SPOT_REQUEST_ID=$(aws ec2 request-spot-instances --launch-specification file://specs.tmp --spot-price $ec2spotter_bid_price --output="text" --query="SpotInstanceRequests[*].SpotInstanceRequestId" --region ${LAUNCH_REGION})

specs.tmp is used as the instance launch specification: --launch-specification file://specs.tmp.

The "UserData" in the launch specification is a script, which is also generated in ec2spotter-launch:

cat >user-data.tmp <<EOF
#!/bin/sh
echo AWSAccessKeyId=$AWS_ACCESS_KEY > /root/.aws.creds
echo AWSSecretKey=$AWS_SECRET_KEY >> /root/.aws.creds

apt-get update
...

cd /root
git clone --depth=1 https://github.com/slavivanov/ec2-spotter.git
echo Got spotter scripts from github.

cd ec2-spotter

echo Swapping root volume
./ec2spotter-remount-root  --force 1 --vol_name ${ROOT_VOL_NAME} --vol_region ${ROOT_REGION} --elastic_ip $ec2spotter_elastic_ip
EOF
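
To confirm what the instance will actually run, the encoded user data can be decoded again before the spot request is submitted. This is only a minimal sketch, assuming a GNU or macOS `base64`; `decode_user_data` is a hypothetical helper name and not part of ec2-spotter:

```shell
#!/bin/bash
# Hypothetical helper (not part of ec2-spotter): decode a base64 string
# such as the userData value built in ec2spotter-launch.
decode_user_data() {
    printf '%s' "$1" | base64 --decode
}

# Usage inside ec2spotter-launch, after userData has been computed:
#   decode_user_data "$userData"
```

Because the heredoc delimiter is unquoted, variables such as ${ROOT_VOL_NAME} are expanded on your own machine, so the decoded text shows the final values, including the AWS keys. Avoid pasting that output anywhere public.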

The actual work of swapping the root volume is done by the ec2spotter-remount-root script, which is downloaded from github.

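There are many echo statements in that script, so I think that if you find where the output goes, you will be able to understand what went wrong.
So, when you hit the problem, you would ssh to the instance and check the log files.
The question is which file to check (and whether the script output is being logged to a file at all).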

Here is what I would suggest trying:

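1. Check the standard logs under /var/log that are generated when the instance boots (cloud-init.log, syslog, etc.) and see whether you can find the ec2spotter-remount-root output there.
2. Try enabling logging yourself, similar to what is described here.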

I would try modifying the user-data.tmp part of ec2spotter-launch this way:

#!/bin/bash
set -x
exec > >(tee /var/log/user-data.log|logger -t user-data ) 2>&1
echo AWSAccessKeyId=$AWS_ACCESS_KEY > /root/.aws.creds
...
echo Swapping root volume
./ec2spotter-remount-root  --force 1 --vol_name ${ROOT_VOL_NAME} --vol_region ${ROOT_REGION} --elastic_ip $ec2spotter_elastic_ip
EOF

Here I changed the first three lines to enable logging to /var/log/user-data.log.

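3. If 1 and 2 don't work, I would try asking the script author on github. Since there are many echo statements in the script, the author should know where to look for the output.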
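
The first two suggestions can be combined into a single check to run on the instance after you ssh in. This is a minimal sketch: the `find_swap_output` helper is hypothetical, and the list of log files is an assumption. On Ubuntu images cloud-init normally writes script output to /var/log/cloud-init-output.log, and /var/log/user-data.log only exists if the logging change above was applied.

```shell
#!/bin/bash
# Hypothetical helper: print every line that mentions the marker text
# from each readable log file, prefixed with file name and line number.
find_swap_output() {
    local marker=$1
    shift
    local f
    for f in "$@"; do
        # Skip files that are missing or unreadable
        [ -r "$f" ] || continue
        grep -Hn -- "$marker" "$f"
    done
    return 0
}

# Usage on the instance (run with sudo if the logs are root-only):
#   find_swap_output "Swapping root volume" \
#       /var/log/cloud-init-output.log /var/log/syslog /var/log/user-data.log
```

The marker "Swapping root volume" is the line echoed by the user-data script just before ec2spotter-remount-root runs, so whichever file contains it should also hold the remount script's own output right after it.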

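Hope that helps. Also, you don't need to wait for the problem to occur before trying this; you can look for the script output after a successful run.
Or, if you are able to do a few test runs, do that and make sure you can find the log that contains the script output.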
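
On a test run, the swap can also be checked from your own machine without ssh. This is a minimal sketch built on two AWS CLI commands, `aws ec2 describe-volumes` and `aws ec2 get-console-output`. The function names are hypothetical, and the volume name and region are taken from the my.conf in the question.

```shell
#!/bin/bash
# Hypothetical helpers, assuming the AWS CLI is installed and configured.

# Print the attachment state (for example "attached") of the first volume
# whose Name tag matches. With no attachment, the text output is expected
# to be "None".
volume_attachment() {
    local vol_name=$1 region=$2
    aws ec2 describe-volumes \
        --region "$region" \
        --filters "Name=tag:Name,Values=$vol_name" \
        --query 'Volumes[0].Attachments[0].State' \
        --output text
}

# Print the instance's console output, which on many images includes
# boot-time messages.
console_log() {
    local instance_id=$1 region=$2
    aws ec2 get-console-output \
        --instance-id "$instance_id" \
        --region "$region" \
        --output text
}

# Usage, with the values from my.conf and the launch script:
#   volume_attachment spotter us-west-2
#   console_log "$instance_id" us-west-2
```

If `volume_attachment` never reports an attachment while the instance is running, the swap did not get as far as attaching the volume, and the console output is one more place to look for the script's echo lines.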
