C++ string handling: substring search with find (matching), plus a small interview problem: phone number prefix matching

References:

[1] http://www.cplusplus.com/reference/string/string/

std::string (class, defined in <string>)

typedef basic_string<char> string;

String class

Strings are objects that represent sequences of characters.

The standard string class provides support for such objects with an interface similar to that of a standard container of bytes, but adding features specifically designed to operate with strings of single-byte characters. (Note: this means single-byte characters such as ASCII. A Chinese character needs more than one byte: 2 bytes in GBK, 3 bytes in UTF-8.)

For example:

string chinese = "你";
string english = "n";
cout << chinese << "'s size()=" << chinese.size() <<chinese<< "'s length()=" << chinese.length() << endl;
cout << english << "'s size()=" << english.size() << english << "'s length()=" << english.length() << endl;
/* Output with a 2-byte encoding such as GBK; with UTF-8, 你 takes 3 bytes and both values are 3:
你's size()=2你's length()=2
n's size()=1n's length()=1*/

The string class is an instantiation of the basic_string class template that uses char (i.e., bytes) as its character type, with its default char_traits and allocator types (see basic_string for more info on the template).

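Note: string handles characters independently of the encoding. Whether the encoding is multi-byte or variable-length, all of string's member functions and its iterators operate on bytes!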

Note that this class handles bytes independently of the encoding used: If used to handle sequences of multi-byte or variable-length characters (such as UTF-8), all members of this class (such as length or size), as well as its iterators, will still operate in terms of bytes (not actual encoded characters).
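To make the byte-versus-character distinction concrete, here is a minimal sketch. The helper `utf8_length` and the constant `kNiUtf8` are hypothetical names introduced for illustration (they are not part of std::string), and the sketch assumes the input is valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of std::string): counts UTF-8 code points
// by skipping continuation bytes, which have the bit pattern 10xxxxxx.
// Assumes the input is valid UTF-8.
std::size_t utf8_length(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

// The character 你 (U+4F60) written as its three UTF-8 bytes, so the
// result does not depend on the encoding of the source file.
const std::string kNiUtf8 = "\xE4\xBD\xA0";
```

With these definitions, `kNiUtf8.size()` is 3 (bytes) while `utf8_length(kNiUtf8)` is 1 (character), which is exactly the gap the note above warns about.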

1.Member types:

member type             definition
value_type              char
traits_type             char_traits<char>
allocator_type          allocator<char>
reference               char&
const_reference         const char&
pointer                 char*
const_pointer           const char*
iterator                a random access iterator to char (convertible to const_iterator)
const_iterator          a random access iterator to const char
reverse_iterator        reverse_iterator<iterator>
const_reverse_iterator  reverse_iterator<const_iterator>
difference_type         ptrdiff_t
size_type               size_t

2.Member functions:

(constructor)  Construct string object (public member function)
(destructor)   String destructor (public member function)
operator=      String assignment (public member function)

3.Iterators:

begin    Return iterator to beginning (public member function)
end      Return iterator to end (public member function)
rbegin   Return reverse iterator to reverse beginning (public member function)
rend     Return reverse iterator to reverse end (public member function)
cbegin   Return const_iterator to beginning (public member function)
cend     Return const_iterator to end (public member function)
crbegin  Return const_reverse_iterator to reverse beginning (public member function)
crend    Return const_reverse_iterator to reverse end (public member function)

4.Capacity:

size           Return length of string (public member function)
length         Return length of string (public member function)
max_size       Return maximum size of string (public member function)
resize         Resize string (public member function)
capacity       Return size of allocated storage (public member function)
reserve        Request a change in capacity (public member function)
clear          Clear string (public member function)
empty          Test if string is empty (public member function)
shrink_to_fit  Shrink to fit (public member function)

5.Element access:

operator[]  Get character of string (public member function)
at          Get character in string (public member function)
back        Access last character (public member function)
front       Access first character (public member function)

6.Modifiers:

operator+=  Append to string (public member function)
append      Append to string (public member function)
push_back   Append character to string (public member function)
assign      Assign content to string (public member function)
insert      Insert into string (public member function)
erase       Erase characters from string (public member function)
replace     Replace portion of string (public member function)
swap        Swap string values (public member function)
pop_back    Delete last character (public member function)

7.String operations:

c_str              Get C string equivalent (public member function)
data               Get string data (public member function)
get_allocator      Get allocator (public member function)
copy               Copy sequence of characters from string (public member function)
find               Find content in string (public member function)
rfind              Find last occurrence of content in string (public member function)
find_first_of      Find character in string (public member function)
find_last_of       Find character in string from the end (public member function)
find_first_not_of  Find absence of character in string (public member function)
find_last_not_of   Find non-matching character in string from the end (public member function)
substr             Generate substring (public member function)
compare            Compare strings (public member function)

8.Non-member function overloads:

operator+             Concatenate strings (function)
relational operators  Relational operators for string (function)
swap                  Exchanges the values of two strings (function)
operator>>            Extract string from stream (function)
operator<<            Insert string into stream (function)
getline               Get line from stream into string (function)

9.Member constants:

npos  Maximum value for size_t (public static member constant)

std::string::npos (public static member constant, defined in <string>)

static const size_t npos = -1;

npos is a static member constant value with the greatest possible value for an element of type size_t.

This value, when used as the value for a len (or sublen) parameter in string's member functions, means "until the end of the string".

As a return value, it is usually used to indicate no matches.

This constant is defined with a value of -1. Because size_t is an unsigned integral type, -1 converts to the largest value representable by this type.

cout << "string::npos=" << string::npos << endl;/*string::npos=4294967295 on a build with a 32-bit size_t; with a 64-bit size_t it prints 18446744073709551615*/
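The two roles of npos described above (a "to the end" length and a "no match" return value) can be sketched as follows. The function names `npos_is_max`, `tail_from`, and `not_found` are hypothetical helpers introduced only for this illustration.

```cpp
#include <limits>
#include <string>

// npos is the largest value that size_type can represent.
bool npos_is_max() {
    return std::string::npos ==
           std::numeric_limits<std::string::size_type>::max();
}

// As a length argument, npos means "until the end of the string".
// Like substr itself, this throws std::out_of_range if pos > s.size().
std::string tail_from(const std::string& s, std::string::size_type pos) {
    return s.substr(pos, std::string::npos);
}

// As a return value of find, npos means "no match".
bool not_found(const std::string& s, const std::string& needle) {
    return s.find(needle) == std::string::npos;
}
```

For instance, `tail_from("123abcd", 3)` yields "abcd", and `not_found("123abcd", "jk")` is true.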

 

Reference

[2] https://www.cnblogs.com/web100/archive/2012/12/02/cpp-string-find-npos.html

string.find() and string::npos in C++

Check whether string A contains substring B:

string::size_type pos = strA.find(strB);
if (pos != string::npos)
{
	cout << "index of " << strB << " in " << strA << " is: " << pos << " (indices start at 0)" << endl;
}
/*index of abcd in 123abcd is: 3 (indices start at 0)*/


int idx = strA.find("abc");	// wrong type: should be string::size_type
if (idx == string::npos)
{
	cout << "substring not found in " << strA << endl;
}
else
{
	cout << idx << endl;
}

In the code above, idx is declared as int, which is wrong. Declaring it as unsigned int would be wrong as well: it must be declared as string::size_type.

npos is defined as follows:

static const size_type npos = -1;

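Because string::size_type (defined by the string's allocator) describes a size, it must be an unsigned integer type. The default allocator uses size_t as size_type, so -1 is converted to that unsigned type and npos becomes its largest value. The actual number depends on how size_type is defined, and unfortunately these maxima differ between types: (unsigned long)-1 and (unsigned short)-1 are different values whenever the two types differ in size. So in the comparison idx == string::npos, if idx has a different type from string::npos, the result may be false even when find() returned npos. For example, where size_t is 64 bits and unsigned int is 32 bits, an unsigned int idx holds 4294967295 after the assignment, which is not equal to npos.

The most reliable way to test whether find() returned npos is to compare the result directly: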

if (str.find("abc") == string::npos) { ... }
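A minimal sketch of the advice above: keep the result of find() in size_type (or compare it directly), and never narrow it. The names `contains` and `narrowed_npos_still_matches` are hypothetical and exist only for this illustration.

```cpp
#include <string>

// Hypothetical helper: true if needle occurs anywhere in haystack.
// The result of find() is compared directly and never stored in a
// narrower type.
bool contains(const std::string& haystack, const std::string& needle) {
    return haystack.find(needle) != std::string::npos;
}

// Demonstrates the pitfall: narrow npos to unsigned int, then compare
// it with npos again. The comparison only holds on platforms where
// unsigned int is at least as wide as size_t.
bool narrowed_npos_still_matches() {
    const unsigned int idx = static_cast<unsigned int>(std::string::npos);
    return idx == std::string::npos;
}
```

On a typical 64-bit platform `narrowed_npos_still_matches()` returns false, which is why `string::size_type` (or `auto`) is the safer choice for holding the result.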

    // find returns size_type

typedef typename _Mybase::size_type size_type;

size_type find(const _Elem *_Ptr, size_type _Off = 0) const
{	// look for [_Ptr, <null>) beginning at or after _Off
	_DEBUG_POINTER(_Ptr);
    return (find(_Ptr, _Off, _Traits::length(_Ptr)));
}

string s("1a2b3c4d5e6f7g8h9i1a2b3c4d5e6f7g8ha9i");
string flag;
string::size_type position;

// find returns the index of "jk" in s
position = s.find("jk");
if (position != s.npos)  // when nothing is found, find returns the special value npos (4294967295 on my build)
{
	cout << "position is : " << position << endl;
}
else
{
	cout << "Not found the flag" + flag;
}

// find_first_of returns the index of the first character in s that matches any character in flag

size_type find_first_of(const _Myt& _Right,
size_type _Off = 0) const _NOEXCEPT
{// look for one of _Right at or after _Off
	return (find_first_of(_Right._Myptr(), _Off, _Right.size()));
}

flag = "c";
position = s.find_first_of(flag);
cout << "s.find_first_of(flag) is : " << position << endl;
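/*Not found the flags.find_first_of(flag) is : 5
  (the leading "Not found the flag" was printed by the find("jk") branch above, which writes no newline)*/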


flag = "35";
position = s.find_first_of(flag,2);
cout << "s.find_first_of(flag) is : " << position << endl;
/*s.find_first_of(flag) is : 4*/
// find every position in s where flag occurs
/*string s("1a2b3c4d5e6f7g8h9i1a2b3c4d5e6f7g8ha9i");*/
flag = "a";
position = 0;
int i = 1;
while ((position = s.find_first_of(flag, position)) != string::npos)
{
	//position=s.find_first_of(flag,position);
	cout << "position  " << i << " : " << position << endl;
	position++;
	i++;
}
	/*Output:
	position  1 : 1
	position  2 : 19
	position  3 : 34
	*/
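The loop above uses find_first_of, which matches any single character of flag; for a one-character flag that is equivalent to find. To collect every occurrence of a multi-character substring, find is the call to use. A minimal sketch, where `find_all` is a hypothetical helper name and returning nothing for an empty needle is an arbitrary choice made here:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: indices of every occurrence of needle in s,
// including overlapping ones. An empty needle yields no results
// (a choice made for this sketch).
std::vector<std::string::size_type> find_all(const std::string& s,
                                             const std::string& needle) {
    std::vector<std::string::size_type> positions;
    if (needle.empty()) {
        return positions;
    }
    std::string::size_type pos = s.find(needle);
    while (pos != std::string::npos) {
        positions.push_back(pos);
        // Resume one past the last hit so overlapping matches are found.
        pos = s.find(needle, pos + 1);
    }
    return positions;
}
```

Called with the string s from the example and "a", this returns the same indices as the loop above: 1, 19, and 34.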

    // find_first_not_of returns the index of the first character in flag that does not occur in s

size_type find_first_not_of(const _Myt& _Right,
size_type _Off = 0) const _NOEXCEPT
{// look for none of _Right at or after _Off
	return (find_first_not_of(_Right._Myptr(), _Off,
	_Right.size()));
}

/*string s("1a2b3c4d5e6f7g8h9i1a2b3c4d5e6f7g8ha9i");*/
//flag = "acb12389efgxyz789";	/*Output: flag.find_first_not_of (s) :11, i.e. the character at index 11 ('x') is the first character of flag that does not occur in s*/
//flag = "xyz";/*Output: flag.find_first_not_of (s) :0*/

flag = "x1a2byz";/*flag.find_first_not_of (s) :0*/
position = flag.find_first_not_of(s);
cout << "flag.find_first_not_of (s) :" << position << endl;
// reverse search: the last position at which flag occurs in s
flag="3";
position=s.rfind (flag);
cout << "s.rfind (flag) :" << position << endl;/*s.rfind (flag) :22*/
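Two common uses of these functions, sketched under the assumption that they are wanted as small utilities: trimming with find_first_not_of / find_last_not_of, and taking a file extension with rfind. The names `trim` and `extension_of` are hypothetical.

```cpp
#include <string>

// Hypothetical helper: strips leading and trailing characters that
// appear in `chars` (whitespace by default).
std::string trim(const std::string& s, const std::string& chars = " \t\r\n") {
    const std::string::size_type first = s.find_first_not_of(chars);
    if (first == std::string::npos) {
        return "";  // s is empty or consists only of characters in chars
    }
    const std::string::size_type last = s.find_last_not_of(chars);
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: the text after the last '.', or an empty string
// when the name contains no dot.
std::string extension_of(const std::string& filename) {
    const std::string::size_type dot = filename.rfind('.');
    if (dot == std::string::npos) {
        return "";
    }
    return filename.substr(dot + 1);
}
```

Both helpers check for npos before using the index, the same pattern as the find examples earlier.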

C++ string operations: substring search (matching)

 

Programming problem:

 

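/*
Phone number location lookup
The input has T test cases.
Each test case {
N rules,
given on the next N lines.
Each line contains two strings separated by a space,
e.g.: 123456xxxxx beijing
      1361xxxxxxx kunming
(x stands for any digit; phone numbers have 11 digits)
Then M phone numbers follow,
one per line on the next M lines,
e.g.: 12345678910

For each test case, print M lines.
Each line is the location of the corresponding phone number; print "unknown" if no rule matches it.
}
In total, T groups are printed, each with M locations.

*/

/*I did not read the problem carefully and treated x as the literal letter 'x'. What a blunder!
Read the problem carefully and think it through before writing code!
*/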

// wangyi4.cpp : Defines the entry point for the console application.
//


#include "stdafx.h"

#include<iostream>
#include<string>
#include<vector>
#include<algorithm>
using namespace std;

int main()
{
	int T;
	cin >> T;
	for (int i = 0; i < T; i++)
	{
		int N;
		cin >> N;
		vector<string> rules_fix;
		vector<string> rules_pos;
		string rule_fix;
		string rule_pos;
		for (int i = 0; i < N; i++)
		{
			cin >> rule_fix;
			cin >> rule_pos;
			rules_fix.push_back(rule_fix);
			rules_pos.push_back(rule_pos);

		}

		int M;
		vector<string> phoneNums;
		string phoneNum;
		cin >> M;
		for (int i = 0; i < M; i++)
		{
			cin >> phoneNum;
			phoneNums.push_back(phoneNum);
		}


		vector<string> numOfPositions;
		

		for (int j = 0; j < phoneNums.size(); j++)
		{

			phoneNum = phoneNums[j];
			vector<int> theBestMatchs;

			for (int i = 0; i < rules_fix.size(); i++)
			{
				theBestMatchs.push_back(0);
				rule_fix = rules_fix[i];
				int k;

				for (k = 1; k <=11; k++)
				{
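					/*substr(0,k): returns the substring of length k starting at index 0; if the length is omitted, it runs to the end of the string.

					Open the function and read its English comment carefully!
					typedef basic_string<_Elem, _Traits, _Alloc> _Myt;

					_Myt substr(size_type _Off = 0, size_type _Count = npos) const
					{	// return [_Off, _Off + _Count) as new string; note that the interval is half-open!
					return (_Myt(*this, _Off, _Count, get_allocator()));
					}

					Note to self: counting k from 0 misuses substr here. Either call substr(0,k+1) or start k at 1.
					*/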
					if (phoneNum.substr(0, k) == rule_fix.substr(0, k))
					{
						theBestMatchs[i]++;
						continue;
					}
					else//stop as soon as the prefixes differ
					{
						break;
					}
				}
			}
			/*Find the index of the largest element of theBestMatchs, i.e. the index of the best-matching rule among the N rules*/
			auto maxPosition = max_element(theBestMatchs.begin(), theBestMatchs.end());
			int index = maxPosition - theBestMatchs.begin();

			if (theBestMatchs[index] == 0)
			{
				numOfPositions.push_back("unknown");
			}
			else
			{
				numOfPositions.push_back(rules_pos[index]);
			}
			
		}
		/*Print the locations of these M phone numbers*/
		for (int i = 0; i < numOfPositions.size(); i++)
		{
			cout << numOfPositions[i] << endl;
		}

	}

	system("pause");
	return 0;
}


/*
Debug run 1 (wrong):
1
2
12345678910 beijing
32165498710 shanghai
1
98765432110
beijing
Press any key to continue . . .


Debug run 2 (correct):
1
2
12345678910 beijing
12365498710 shanghai
1
12345685210 (the best match is found!)
beijing
Press any key to continue . . .

*/

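This exercise is a simple string prefix matching (search) problem.

It tests how familiar you are with string operations: the methods provided by the string class, and the sorting and searching functions provided by the STL's <algorithm>.

Get the approach clear before writing code, and mind the details while writing it: loop boundary conditions, the half-open intervals used by iterators and similar interfaces, and transfers of control flow.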
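The program above counts the longest common prefix and compares 'x' literally, so a rule such as 1361xxxxxxx still scores 3 against a number starting with 1362. A sketch of a version that treats x as a wildcard is given below. It rests on assumptions the problem statement does not spell out: the x characters only appear as a trailing run, and when several rules match, the one with the most fixed digits wins. The names `Rule`, `fixed_prefix`, and `lookup` are hypothetical.

```cpp
#include <string>
#include <utility>
#include <vector>

// A rule pairs a pattern such as "1361xxxxxxx" with a location name.
using Rule = std::pair<std::string, std::string>;

// Hypothetical helper: the fixed digits of a pattern, i.e. everything
// before the first 'x' (the whole pattern if it contains no 'x').
// Assumes the wildcards form a trailing run.
std::string fixed_prefix(const std::string& pattern) {
    // find returns npos when there is no 'x'; substr(0, npos) is the
    // whole string.
    return pattern.substr(0, pattern.find('x'));
}

// Returns the location of the matching rule with the longest fixed
// prefix, or "unknown" if no rule matches (tie-breaking by prefix
// length is an assumption of this sketch).
std::string lookup(const std::vector<Rule>& rules, const std::string& phone) {
    bool found = false;
    std::string::size_type best_length = 0;
    std::string best_location = "unknown";
    for (const Rule& rule : rules) {
        const std::string prefix = fixed_prefix(rule.first);
        // compare(0, n, prefix) == 0 tests "phone starts with prefix".
        if (phone.compare(0, prefix.size(), prefix) != 0) {
            continue;
        }
        if (!found || prefix.size() > best_length) {
            found = true;
            best_length = prefix.size();
            best_location = rule.second;
        }
    }
    return best_location;
}
```

With the two sample rules from the problem statement, `lookup` maps 12345678910 to beijing and reports unknown for a number beginning with 1362, because every fixed digit of a rule has to match.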

 

Appendix:

C++ string I/O: stringstream (istringstream, ostringstream)

File I/O: fstream (ifstream, ofstream)

Standard input/output: cin, cout

A .txt file stores names, each followed by one or more phone numbers, as follows:

morgan 2015552368 8625550123
drew 9735550130
lee 6095550132 201550175 8005550000

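Read it in using file I/O together with string I/O (further processing is covered in the next post, on regular expressions: string matching, replacement, and so on), then write it back to a file.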

// FileIO_sstream.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include<fstream>
#include<sstream>
#include<string>
#include<vector>
using namespace std;


struct PersonInfo{
	string name;
	vector<string> phones;
};

int _tmain(int argc, _TCHAR* argv[])
{
	ifstream inFile("personInfo.txt");
	string line, word;
	vector<PersonInfo> people;
	if (inFile)
	{
		while (getline(inFile, line))/*read from a file stream rather than from the standard input device (keyboard) via cin*/
		{
			PersonInfo perInfo;
			istringstream record(line);
			record >> perInfo.name;
			while (record >> word)
			{
				perInfo.phones.push_back(word);
			}
			people.push_back(perInfo);
		}
	}

	/*When an fstream object is destroyed, its close() is called automatically, so the explicit close() calls here are optional.*/

	inFile.close();
	ofstream outFile;
	ostringstream os;
	outFile.open("personInfoOut.txt");
	if (outFile)
	{
		for (const auto & entry : people)
		{
			ostringstream formatted;
			for (const auto& nums : entry.phones)
			{
				formatted << " " << nums;
			}
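			os << entry.name << formatted.str() << endl;/*formatted already starts with a space*/
			/*First write the output into an in-memory ostringstream object, os.*/
		}
		outFile << os.str();/*Then write the whole string held by os to the file stream outFile in one go.*/
		/*Conversions between these streams are very flexible!*/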
	}
	outFile.close();
	

	return 0;
}
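The string-stream part of the program can be tried without any files. The sketch below isolates it into two functions; `parse_record` and `format_record` are hypothetical names, and the record layout (a name followed by whitespace-separated numbers) is taken from the sample file above.

```cpp
#include <sstream>
#include <string>
#include <vector>

struct PersonInfo {
    std::string name;
    std::vector<std::string> phones;
};

// Splits one whitespace-separated record into a name and its phone
// numbers, using an istringstream exactly as the program above does.
PersonInfo parse_record(const std::string& line) {
    PersonInfo info;
    std::istringstream record(line);
    record >> info.name;
    std::string word;
    while (record >> word) {
        info.phones.push_back(word);
    }
    return info;
}

// Formats a record back into "name phone1 phone2 ..." with single
// spaces, building the text in an ostringstream first.
std::string format_record(const PersonInfo& info) {
    std::ostringstream out;
    out << info.name;
    for (const std::string& phone : info.phones) {
        out << ' ' << phone;
    }
    return out.str();
}
```

Parsing the line "lee 6095550132 201550175 8005550000" gives the name lee with three phone numbers, and formatting the result reproduces the same line.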

 

 

    Original author: 珞喻小森林
    Original URL: https://blog.csdn.net/m0_37357063/article/details/81560212