A Brief Look at Multithreaded Downloads and Resumable Transfers

February 22, 2023. Source: ImLynn

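I. Before discussing multithreading, we first need to cover the HTTP protocol, because multithreaded downloading builds directly on it. Here is a brief look at HTTP requests and responses:

1. Under HTTP, requesting a file from a server only takes a request like the following: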

GET /Path/FileName HTTP/1.0

Host: www.server.com:80

Accept: */*

User-Agent: GeneralDownloadApplication

Connection: close

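The request line starts with a method token, followed by the Request-URI and the protocol version, each separated by a space, in this format: Method Request-URI HTTP-Version CRLF

Here Method is the request method, Request-URI is a uniform resource identifier, HTTP-Version is the HTTP version of the request, and CRLF is a carriage return followed by a line feed (apart from this terminating CRLF, no bare CR or LF character is allowed).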
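The request-line format described above can be sketched in Java as follows. This is only a minimal illustration: the class name RequestLine and its methods are hypothetical, and the parser assumes exactly one space between the three fields.

```java
/** Minimal sketch of the "Method Request-URI HTTP-Version" request line. */
class RequestLine {
    final String method;
    final String requestUri;
    final String httpVersion;

    RequestLine(String method, String requestUri, String httpVersion) {
        this.method = method;
        this.requestUri = requestUri;
        this.httpVersion = httpVersion;
    }

    /** Parses a request line that has already had its trailing CRLF removed. */
    static RequestLine parse(String line) {
        String[] parts = line.split(" ");
        if (parts.length != 3 || !parts[2].startsWith("HTTP/")) {
            throw new IllegalArgumentException("Malformed request line: " + line);
        }
        return new RequestLine(parts[0], parts[1], parts[2]);
    }

    /** Serializes the request line, including the terminating CRLF. */
    String format() {
        return method + " " + requestUri + " " + httpVersion + "\r\n";
    }

    public static void main(String[] args) {
        RequestLine line = parse("GET /Path/FileName HTTP/1.0");
        System.out.println(line.method + " | " + line.requestUri + " | " + line.httpVersion);
    }
}
```

A real server has to be stricter about whitespace and allowed characters; the sketch only shows how the three fields map onto the line.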

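1.1 There are several request methods (all method names are uppercase). Each is explained below:

GET requests the resource identified by the Request-URI

POST submits data to be handled by the resource identified by the Request-URI

HEAD requests only the response headers for the resource identified by the Request-URI

PUT asks the server to store a resource and use the Request-URI as its identifier

DELETE asks the server to delete the resource identified by the Request-URI

TRACE asks the server to echo back the request it received; mainly used for testing and diagnostics

CONNECT is reserved in HTTP/1.1 for use with a proxy that can switch to acting as a tunnel

OPTIONS asks about the server's capabilities, or about the options and requirements associated with a resource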

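1.2 Examples:

GET: when you visit a page by typing its address into the browser's address bar, the browser uses GET to fetch the resource from the server, e.g. GET /form.html HTTP/1.1 (CRLF)

POST asks the server to accept the data attached after the request headers; it is commonly used to submit forms.

e.g. POST /reg.jsp HTTP/ (CRLF)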

Accept:image/gif,image/x-xbit,… (CRLF)

…

HOST:www.guet.edu.cn (CRLF)

Content-Length:21 (CRLF)

Connection:Keep-Alive (CRLF)

Cache-Control:no-cache (CRLF)

(CRLF) // this CRLF marks the end of the message headers; everything before it is header

user=jeffrey&pwd=1234 // the submitted data starts on this line
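To make the relationship between the headers, the empty line, and the body concrete, here is a small Java sketch that assembles a POST request as text. The class PostRequestBuilder and the form body used in main are hypothetical; the point is that Content-Length is the byte count of the body and that an empty line separates headers from data.

```java
import java.nio.charset.StandardCharsets;

/** Minimal sketch: builds the text of a form POST request. */
class PostRequestBuilder {

    static String build(String path, String host, String body) {
        // Content-Length counts bytes of the body, not characters
        int length = body.getBytes(StandardCharsets.UTF_8).length;
        return "POST " + path + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "Content-Type: application/x-www-form-urlencoded\r\n"
                + "Content-Length: " + length + "\r\n"
                + "\r\n"   // empty line: end of the headers
                + body;    // everything after it is the submitted data
    }

    public static void main(String[] args) {
        // Hypothetical form data, for illustration only
        System.out.print(build("/reg.jsp", "www.guet.edu.cn", "name=alice&age=30"));
    }
}
```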

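HEAD is almost identical to GET: the HTTP headers in the response to a HEAD request carry the same information as those returned for a GET request, but no message body is sent. This makes it possible to learn about the resource identified by the Request-URI without transferring its content. HEAD is commonly used to test whether a hyperlink is valid and reachable, and whether the resource has been modified recently.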
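For a downloader, HEAD is a cheap way to learn a file's size and whether the server accepts range requests before any data is transferred. The sketch below uses the standard HttpURLConnection class; the class name HeadProbe, its methods, and the default URL are hypothetical placeholders.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** Minimal sketch: probe a resource with HEAD before downloading it. */
class HeadProbe {

    /** Prepares a HEAD request; nothing is sent until the response is read. */
    static HttpURLConnection prepareHead(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestMethod("HEAD");
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Sends the request and summarizes the headers a downloader cares about. */
    static String describe(HttpURLConnection conn) throws IOException {
        try {
            int code = conn.getResponseCode();            // the request goes out here
            long length = conn.getContentLengthLong();    // -1 if the server sent no length
            String acceptRanges = conn.getHeaderField("Accept-Ranges");
            return code + " length=" + length + " Accept-Ranges=" + acceptRanges;
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder URL; pass a real one as the first argument
        String url = args.length > 0 ? args[0] : "http://www.server.com/Path/FileName";
        System.out.println(describe(prepareHead(url)));
    }
}
```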

2. If the server receives the request successfully and no error occurs, it returns data like the following:

HTTP/1.0 200 OK

Content-Length: 13057672

Content-Type: application/octet-stream

Last-Modified: Wed, 10 Oct 2005 00:56:34 GMT

Accept-Ranges: bytes

ETag: "2f38a6cac7cec51:160c"

Server: Microsoft-IIS/6.0

X-Powered-By: ASP.NET

Date: Wed, 16 Nov 2005 01:57:54 GMT

Connection: close

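The status line has the following format: HTTP-Version Status-Code Reason-Phrase CRLF

Here HTTP-Version is the server's HTTP version, Status-Code is the response status code sent back by the server, and Reason-Phrase is the textual description of the status code.

The status code consists of three digits. The first digit defines the class of the response and has five possible values:

1xx: Informational - the request was received and processing continues

2xx: Success - the request was successfully received, understood, and accepted

3xx: Redirection - further action must be taken to complete the request

4xx: Client error - the request contains bad syntax or cannot be fulfilled

5xx: Server error - the server failed to fulfill a valid request

Common status codes, with their reason phrases and meanings:

200 OK // the client request succeeded

400 Bad Request // the request has a syntax error and the server cannot understand it

401 Unauthorized // the request is not authorized; this status code must be used together with the WWW-Authenticate header field

403 Forbidden // the server received the request but refuses to serve it

404 Not Found // the requested resource does not exist, e.g. a wrong URL was entered

500 Internal Server Error // the server hit an unexpected error

503 Service Unavailable // the server cannot handle the request right now and may recover after a while

e.g. HTTP/1.1 200 OK (CRLF)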
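As an illustration of the status-line format and the five response classes, the following Java sketch splits a status line into its parts and classifies the code by its first digit. The class StatusLine and its method names are hypothetical.

```java
/** Minimal sketch of "HTTP-Version Status-Code Reason-Phrase". */
class StatusLine {
    final String httpVersion;
    final int statusCode;
    final String reasonPhrase;

    StatusLine(String httpVersion, int statusCode, String reasonPhrase) {
        this.httpVersion = httpVersion;
        this.statusCode = statusCode;
        this.reasonPhrase = reasonPhrase;
    }

    /** Parses a status line that has already had its trailing CRLF removed. */
    static StatusLine parse(String line) {
        // Limit of 3 keeps spaces inside the reason phrase ("Not Found")
        String[] parts = line.split(" ", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Malformed status line: " + line);
        }
        String reason = parts.length == 3 ? parts[2] : "";
        return new StatusLine(parts[0], Integer.parseInt(parts[1]), reason);
    }

    /** Maps the first digit of the status code to its response class. */
    String category() {
        switch (statusCode / 100) {
            case 1: return "informational";
            case 2: return "success";
            case 3: return "redirection";
            case 4: return "client error";
            case 5: return "server error";
            default: return "unknown";
        }
    }

    public static void main(String[] args) {
        StatusLine status = parse("HTTP/1.1 200 OK");
        System.out.println(status.statusCode + " is a " + status.category() + " response");
    }
}
```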

For a more detailed explanation of HTTP, see this well-written article:

HTTP协议详解 (HTTP Protocol Explained in Detail)

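II. The basic logic of a multithreaded download:

A. First, create a local temporary file with the same size as the file on the server.

B. Decide how many threads will download the resource, and work out the start position of each thread's block.

How is that start position calculated?

File length / number of threads = block size (the number of bytes each thread downloads). So:

Thread 1 downloads bytes 0 to block size - 1.

Thread 2 and the following threads follow the same pattern.

In general, thread i starts at (i - 1) * block size and ends at i * block size - 1; the last thread runs to the end of the file (file length - 1), so it also picks up the remainder of the division.

C. Start the threads; each one downloads its own part of the file.

D. Once every thread has finished downloading its data, the whole resource is on the local machine.

E. One question remains: how to join the parts downloaded by the separate threads into one file. This is what the RandomAccessFile class (random-access file I/O) is for: each thread seeks to its own start position in the shared file and writes its data there.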
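Steps B and E can be tried without any network at all. The sketch below is only an illustration under stated assumptions: BlockSplitter, split, and assemble are hypothetical names, and an in-memory byte array stands in for the server file. It computes inclusive byte ranges for each thread, then writes the blocks into a pre-sized file in reverse order with RandomAccessFile to show that the write order does not matter.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;

/** Minimal sketch: split a length into blocks and reassemble out of order. */
class BlockSplitter {

    /** Returns {start, end} (both inclusive) for each of threadCount blocks. */
    static long[][] split(long length, int threadCount) {
        long blockSize = length / threadCount;
        long[][] ranges = new long[threadCount][2];
        for (int i = 0; i < threadCount; i++) {
            ranges[i][0] = i * blockSize;
            // The last block also takes the remainder of the division
            ranges[i][1] = (i == threadCount - 1) ? length - 1 : (i + 1) * blockSize - 1;
        }
        return ranges;
    }

    /** Writes every block at its own offset, last block first, and reads the file back. */
    static byte[] assemble(byte[] data, int threadCount) {
        try {
            File target = File.createTempFile("blocks", ".tmp");
            try (RandomAccessFile raf = new RandomAccessFile(target, "rw")) {
                raf.setLength(data.length);              // step A: pre-size the file
                long[][] ranges = split(data.length, threadCount);
                for (int i = threadCount - 1; i >= 0; i--) {
                    int start = (int) ranges[i][0];
                    int count = (int) (ranges[i][1] - ranges[i][0] + 1);
                    raf.seek(start);                     // jump to this block's offset
                    raf.write(data, start, count);
                }
            }
            byte[] result = Files.readAllBytes(target.toPath());
            target.delete();
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (long[] range : split(10, 3)) {
            System.out.println(range[0] + "-" + range[1]);
        }
        System.out.println(new String(assemble("hello world".getBytes(), 3)));
    }
}
```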

The code is as follows:

import java.io.InputStream;
import java.io.RandomAccessFile;
import java.net.HttpURLConnection;
import java.net.URL;

public class Demo {

    public static int threadCount = 3;

    public static void main(String[] args) throws Exception {
        // Connect to the server, get the file length, and create a local
        // temporary file with the same size as the file on the server
        String path = "http://192.168.1.100:8080/360.exe";
        URL url = new URL(path);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(5000);
        conn.setRequestMethod("GET");
        int code = conn.getResponseCode();
        if (code == 200) {
            // Length of the data returned by the server, i.e. the file length
            int length = conn.getContentLength();
            System.out.println("Total file length: " + length);
            RandomAccessFile raf = new RandomAccessFile("setup.exe", "rwd");
            // Set the length of the newly created file
            raf.setLength(length);
            raf.close();
            // Assume 3 threads download the resource;
            // this is the average number of bytes per thread
            int blockSize = length / threadCount;
            for (int threadId = 1; threadId <= threadCount; ++threadId) {
                // Start position of this thread's block
                int startIndex = (threadId - 1) * blockSize;
                // End position of this thread's block (inclusive)
                int endIndex = threadId * blockSize - 1;
                if (threadId == threadCount) {
                    // The last thread also takes the remainder, so its block is slightly longer
                    endIndex = length - 1;
                }
                System.out.println("Thread " + threadId + " downloads: " + startIndex
                        + " --> " + endIndex);
                new DownLoadThread(threadId, startIndex, endIndex, path).start();
            }
        } else {
            System.out.println("Access error");
        }
    }

    /**
     * Worker thread; each one downloads its own part of the file.
     */
    public static class DownLoadThread extends Thread {
        private int threadId;
        private int startIndex;
        private int endIndex;
        private String path;

        /**
         * @param threadId   thread ID
         * @param startIndex first byte of the block (inclusive)
         * @param endIndex   last byte of the block (inclusive)
         * @param path       location of the file on the server
         */
        public DownLoadThread(int threadId, int startIndex, int endIndex,
                String path) {
            this.threadId = threadId;
            this.startIndex = startIndex;
            this.endIndex = endIndex;
            this.path = path;
        }

        @Override
        public void run() {
            try {
                URL url = new URL(path);
                HttpURLConnection conn = (HttpURLConnection) url.openConnection();
                conn.setRequestMethod("GET");
                // Important: ask the server for only this byte range of the file
                conn.setRequestProperty("Range", "bytes=" + startIndex + "-" + endIndex);
                conn.setConnectTimeout(5000);
                // 200 OK means the whole resource was returned;
                // 206 Partial Content means only the requested range was returned
                int code = conn.getResponseCode();
                System.out.println("code=" + code);
                if (code != 206) {
                    // The server ignored the Range header; writing the whole
                    // body at startIndex would corrupt the file
                    System.out.println("Thread " + threadId + ": range request not honored");
                    return;
                }
                InputStream is = conn.getInputStream(); // response body
                RandomAccessFile raf = new RandomAccessFile("setup.exe", "rwd");
                // Position in the file at which this thread starts writing
                raf.seek(startIndex);
                int len = 0;
                byte[] buffer = new byte[1024];
                while ((len = is.read(buffer)) != -1) {
                    raf.write(buffer, 0, len);
                }
                is.close();
                raf.close();
                System.out.println("Thread " + threadId + " finished downloading");
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}

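III. That covers the whole process of downloading a file over HTTP. It cannot yet resume an interrupted download, but resuming is actually simple to implement: send a Range header that starts at the position where the previous attempt stopped.

Suppose a file has 1000 bytes, so its valid byte positions are 0-999. Then:

Range: bytes=500- requests bytes 500-999 of the file, 500 bytes in total.

Range: bytes=500-599 requests bytes 500-599 of the file, 100 bytes in total.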
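The arithmetic behind these two forms can be checked with a short Java sketch. The class RangeHeader and its methods are hypothetical helpers, and the sketch deliberately handles only the two forms shown above (open-ended and closed ranges).

```java
/** Minimal sketch: build Range header values and count the bytes they cover. */
class RangeHeader {

    /** "bytes=first-": from first to the end of the file. */
    static String from(long first) {
        return "bytes=" + first + "-";
    }

    /** "bytes=first-last": both positions are inclusive. */
    static String between(long first, long last) {
        if (first > last) {
            throw new IllegalArgumentException("first must not exceed last");
        }
        return "bytes=" + first + "-" + last;
    }

    /** Number of bytes the range covers in a file of totalLength bytes. */
    static long byteCount(String headerValue, long totalLength) {
        String spec = headerValue.substring("bytes=".length());
        int dash = spec.indexOf('-');
        long first = Long.parseLong(spec.substring(0, dash));
        String lastPart = spec.substring(dash + 1);
        // An empty last position means "up to the final byte of the file"
        long last = lastPart.isEmpty()
                ? totalLength - 1
                : Math.min(Long.parseLong(lastPart), totalLength - 1);
        return last - first + 1;
    }

    public static void main(String[] args) {
        System.out.println(from(500) + " covers " + byteCount(from(500), 1000) + " bytes");
        System.out.println(between(500, 599) + " covers "
                + byteCount(between(500, 599), 1000) + " bytes");
    }
}
```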

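Range can be written in a few other ways, but these two are the most common and are enough for resumable downloads. If the HTTP request contains a Range header and the server supports range requests, the server returns 206 (Partial Content), and the response carries a matching Content-Range header in a format like this:

Content-Range: bytes 500-999/1000

The Content-Range header gives the byte range the server returned and the total length of the file. Note that in this case Content-Length is no longer the size of the whole file but the number of bytes in the returned range.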
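A client can use Content-Range to double-check what it received. The sketch below parses the format shown above and derives the Content-Length that should accompany it; ContentRange is a hypothetical class, and it assumes the total length is given as a number (it does not handle a total written as "*").

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Minimal sketch: parse "bytes first-last/total" from a 206 response. */
class ContentRange {
    private static final Pattern FORMAT = Pattern.compile("bytes (\\d+)-(\\d+)/(\\d+)");

    final long first;
    final long last;
    final long total;

    ContentRange(long first, long last, long total) {
        this.first = first;
        this.last = last;
        this.total = total;
    }

    static ContentRange parse(String headerValue) {
        Matcher m = FORMAT.matcher(headerValue.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported Content-Range: " + headerValue);
        }
        return new ContentRange(Long.parseLong(m.group(1)),
                Long.parseLong(m.group(2)), Long.parseLong(m.group(3)));
    }

    /** Content-Length of the partial response: bytes in the range, not the file size. */
    long expectedContentLength() {
        return last - first + 1;
    }

    public static void main(String[] args) {
        ContentRange range = parse("bytes 500-999/1000");
        System.out.println("range length=" + range.expectedContentLength()
                + ", file length=" + range.total);
    }
}
```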

Finally, here is the code that combines multithreading with resumable downloads:

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.net.HttpURLConnection;
import java.net.URL;

public class TestDownload {

    public static final String path = "http://192.168.1.247:8080/youdao.exe";

    public static void main(String[] args) throws Exception {
        URL url = new URL(path);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000);
        conn.setRequestProperty("User-Agent",
                "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)");
        int code = conn.getResponseCode();
        if (code == 200) {
            int len = conn.getContentLength();
            RandomAccessFile file = new RandomAccessFile("/mnt/sdcard/" + getFilenName(path),
                    "rwd");
            // 1. Make the local file the same size as the file on the server
            file.setLength(len);
            file.close();
            // 2. Assume 3 threads
            int threadnumber = 3;
            int blocksize = len / threadnumber;
            /*
             * Thread 1: 0 ~ blocksize - 1; thread 2: blocksize ~ 2 * blocksize - 1;
             * thread 3: 2 * blocksize ~ end of file
             */
            for (int i = 0; i < threadnumber; i++) {
                int startposition = i * blocksize;
                int endpositon = (i + 1) * blocksize - 1;
                if (i == (threadnumber - 1)) {
                    // Last thread
                    endpositon = len - 1;
                }
                DownLoadTask task = new DownLoadTask(i, path, startposition,
                        endpositon);
                task.start();
            }
        }
    }

    public static String getFilenName(String path) {
        int start = path.lastIndexOf("/") + 1;
        return path.substring(start, path.length());
    }
}

class DownLoadTask extends Thread {
    int threadid;
    String filepath;
    int startposition;
    int endpositon;

    public DownLoadTask(int threadid, String filepath, int startposition,
            int endpositon) {
        this.threadid = threadid;
        this.filepath = filepath;
        this.startposition = startposition;
        this.endpositon = endpositon;
    }

    @Override
    public void run() {
        try {
            File postionfile = new File(threadid + ".txt");
            URL url = new URL(filepath);
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            System.out.println("Thread " + threadid + " downloading, start position: "
                    + startposition + ", end position: " + endpositon);
            if (postionfile.exists()) {
                // A previous run was interrupted: continue from the saved position
                FileInputStream fis = new FileInputStream(postionfile);
                byte[] result = StreamTool.getBytes(fis);
                fis.close();
                int newstartposition = Integer.parseInt(new String(result));
                if (newstartposition > startposition) {
                    startposition = newstartposition;
                }
            }
            if (startposition > endpositon) {
                // The whole block was already downloaded by an earlier run
                postionfile.delete();
                return;
            }
            // e.g. "Range", "bytes=2097152-4194303"
            conn.setRequestProperty("Range", "bytes=" + startposition + "-"
                    + endpositon);
            conn.setRequestMethod("GET");
            conn.setConnectTimeout(5000);
            conn.setRequestProperty("User-Agent",
                    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)");
            InputStream is = conn.getInputStream();
            // Must be the same file that main() created and pre-sized
            RandomAccessFile file = new RandomAccessFile(
                    "/mnt/sdcard/" + TestDownload.getFilenName(filepath), "rwd");
            // Position in the file at which the data is written
            file.seek(startposition);
            byte[] buffer = new byte[1024];
            int len = 0;
            // Position of the data read from the server so far, which is also
            // the position already written to the file
            int currentPostion = startposition;
            while ((len = is.read(buffer)) != -1) {
                file.write(buffer, 0, len);
                currentPostion += len;
                // Persist currentPostion to storage so the download can be resumed
                String position = currentPostion + "";
                FileOutputStream fos = new FileOutputStream(postionfile);
                fos.write(position.getBytes());
                fos.flush();
                fos.close();
            }
            is.close();
            file.close();
            System.out.println("Thread " + threadid + " finished downloading");
            // Once the thread has finished, delete the position file
            if (postionfile.exists()) {
                postionfile.delete();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

// StreamTool is not shown in the original article; this is a minimal stand-in
// that reads a stream fully into a byte array.
class StreamTool {
    public static byte[] getBytes(InputStream is) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int len;
        while ((len = is.read(buffer)) != -1) {
            bos.write(buffer, 0, len);
        }
        return bos.toByteArray();
    }
}
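The resume logic in the listing above boils down to two operations on the per-thread position file: save the current position, and on restart pick the later of the saved position and the block's original start. The sketch below isolates just that bookkeeping so it can be tried on its own. ProgressStore and its method names are hypothetical, and treating an unreadable record as "start over" is an assumption of this sketch rather than something the original code does.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

/** Minimal sketch: persist and restore a download position. */
class ProgressStore {

    /** Overwrites the position file with the current position as decimal text. */
    static void save(File positionFile, long position) {
        try {
            Files.write(positionFile.toPath(),
                    Long.toString(position).getBytes(StandardCharsets.US_ASCII));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns where to resume: the saved position if it is ahead of startPosition. */
    static long resumeFrom(File positionFile, long startPosition) {
        if (!positionFile.exists()) {
            return startPosition;          // first run, nothing to resume
        }
        try {
            byte[] raw = Files.readAllBytes(positionFile.toPath());
            long saved = Long.parseLong(new String(raw, StandardCharsets.US_ASCII).trim());
            return Math.max(saved, startPosition);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NumberFormatException e) {
            return startPosition;          // empty or half-written record: start over
        }
    }

    public static void main(String[] args) {
        File record = new File(System.getProperty("java.io.tmpdir"), "0.txt");
        save(record, 4096);
        System.out.println("resume at " + resumeFrom(record, 0));
        record.delete();
    }
}
```

Writing the record after every buffer, as the listing does, keeps the saved position accurate at the cost of many small disk writes; saving less often trades a little re-downloaded data for speed.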


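    Original author: ImLynn
    Original URL: https://www.jianshu.com/p/1bb084d282be
    This article was reposted from the web for the purpose of sharing knowledge; if it infringes any rights, please contact the blogger to have it removed.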