PHP: Regular expressions and stripping specific tags

March 3, 2023 · 260 views

I'm looking for a way to strip all anchor tags. I also want everything from the ',' up to the <br> removed, while the <br> itself stays in place.

Dirty input:

Abstractor HLTH<br>
Account Representative, Major <a href="#P">P</a><br>
Accountant <a href="#NP">NP</a>, <a href="#M">M</a>, <a href="#REA">REA</a>, <a href="#SKI">SKI</a><br>

It should end up like this:

Abstractor HLTH<br>
Account Representative<br>
Accountant <br>

Please help!

–
Here is the dirty text:

$str = sprintf('

Abstractor HLTH<br>
Account Representative, Major <a href="#P">P</a><br>

Accountant <a href="#NP">NP</a>, <a href="#M">M</a>, <a href="#REA">REA</a>, <a href="#SKI">SKI</a><br>
Accountant, Cost I & II (See Cost Accountant I, II) <a href="#FR">FR</a><br>
Accountant, General <a href="#G">G</a><br>
Accountant, General I (Junior) (See General Accountant) <a href="#FR">FR</a>, <a href="#O/G">O/G</a>, <a href="#W">W</a><br>

Accountant, General II (Intermediate) (See General Accountant) <a href="#FR">FR</a>, <a href="#O/G">O/G</a>, <a href="#W">W</a>, <a href="#HA">HA</a> <br>
Accountant, General III (Senior) (See General Accountant) <a href="#FR">FR</a>, <a href="#O/G">O/G</a>, <a href="#W">W</a> <br>

');

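Best answer

It is generally a bad idea to process HTML strings with regular expressions, but assuming all the links are formed the same way, using preg_replace() should not cause problems. Try this: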

// Removes all links
$str = preg_replace("/<a href=\"#([A-Z\\/]+?)\">\\1<\\/a>(?:, )?/i", "", $str);

// Strip the comma and everything from the comma
// to the next <br> in the line
$str = preg_replace("/,(.*?)(?=<br>)/i", "", $str);
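As a quick check, the two replacements above can be run against the three sample lines from the question. This is only a minimal sketch: the function name stripLinksAndTrailingText and the variable $dirty are made up for illustration, and the patterns are the same as in the answer, just written in single quotes so fewer backslashes are needed.

```php
<?php
// Hypothetical helper; the two patterns are the ones from the answer above.
function stripLinksAndTrailingText(string $html): string
{
    // Step 1: remove every <a href="#X">X</a> link and an optional ", " after it.
    $html = preg_replace('/<a href="#([A-Z\/]+?)">\1<\/a>(?:, )?/i', '', $html);

    // Step 2: remove everything from the first comma on a line up to,
    // but not including, the next <br> (the lookahead keeps the <br>).
    return preg_replace('/,(.*?)(?=<br>)/i', '', $html);
}

// Sample input copied from the question (nowdoc, so nothing is interpolated).
$dirty = <<<'HTML'
Abstractor HLTH<br>
Account Representative, Major <a href="#P">P</a><br>
Accountant <a href="#NP">NP</a>, <a href="#M">M</a>, <a href="#REA">REA</a>, <a href="#SKI">SKI</a><br>
HTML;

echo stripLinksAndTrailingText($dirty), "\n";
// Prints:
// Abstractor HLTH<br>
// Account Representative<br>
// Accountant <br>
```

The order matters here: if the comma rule ran first, it would delete the links after the first comma anyway, but lines whose links come before any comma (such as the "Accountant" line) would keep their first link.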

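To the other answers suggesting strip_tags(): it does not remove the text enclosed by the pair of HTML tags it strips. For example,

Accountant <a href="#NP">NP</a>

becomes

Accountant NP

which is not what the OP wants.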
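The strip_tags() behaviour can be seen in a short sketch. The sample line is a shortened version of the question's input, and the variable name $line is only for illustration.

```php
<?php
$line = 'Accountant <a href="#NP">NP</a>, <a href="#M">M</a><br>';

// strip_tags() removes the tags themselves but keeps the text between them.
echo strip_tags($line), "\n";          // Accountant NP, M

// Allowing <br> through the second argument keeps the line break,
// but the link text is still left behind.
echo strip_tags($line, '<br>'), "\n";  // Accountant NP, M<br>
```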