
Regular Expressions for Links

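I keep running into small problems in projects and never used to pay them much attention. Today, just before leaving work, I was discussing with a colleague how to find certain substrings in a string and replace them (for example, replacing a link with "link": "www.google.com" -> "link"). I said I remembered handling this before. Then I looked at my old code and laughed 😊
So today let's talk about regex matching for links.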

NSRegularExpression For Link (2016-04-18)

#define TestRegularString @"\n This is the test string:1.www.google.com \n 2.http://www.google.com \n 3.https://www.google.com \n 4.Http://www.google.com.cn \n 5.ftp://www.google.com \n 6.http://127.0.0.1:16823 \n"

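The goal is to find every link in the string defined by the macro above and replace it with "网页链接" ("web link").
First, the code I wrote back then: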

- (void)formatStringOne {
    NSMutableString *formatedString = [TestRegularString mutableCopy];
    // Compile the pattern once and reuse it on later calls.
    static NSRegularExpression *iExpression = nil;
    NSError *error = nil;
    // The dot after "www" is escaped here so that it only matches a literal ".".
    iExpression = iExpression ?: [NSRegularExpression regularExpressionWithPattern:@"((http[s]{0,1}|ftp)://[a-zA-Z0-9\\.\\-]+\\.([a-zA-Z]{2,4})(:\\d+)?(/[a-zA-Z0-9\\.\\-~!@#$%^&*+?:_/=<>]*)?)|(www\\.[a-zA-Z0-9\\.\\-]+\\.([a-zA-Z]{2,4})(:\\d+)?(/[a-zA-Z0-9\\.\\-~!@#$%^&*+?:_/=<>]*)?)" options:0 error:&error];
    // Check the returned object rather than the error pointer to detect failure.
    if (iExpression == nil) {
        NSLog(@"error:%@",error);
        return;
    }
    NSMutableArray *linkArray = [@[] mutableCopy];
    [linkArray addObjectsFromArray:[iExpression matchesInString:formatedString options:0 range:NSMakeRange(0,formatedString.length)]];
    while (linkArray.count > 0) {
        // Replace only the first match; the remaining ranges are now stale.
        NSTextCheckingResult *result = [linkArray firstObject];
        [formatedString replaceCharactersInRange:result.range withString:@"网页链接"];

        // Re-scan the whole string to get fresh ranges for the remaining links.
        [linkArray removeAllObjects];
        [linkArray addObjectsFromArray:[iExpression matchesInString:formatedString options:0 range:NSMakeRange(0,formatedString.length)]];
    }
    NSLog(@"%s formatedString:%@",__func__,formatedString);
}

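Ouch. I found this regex somewhere (I'm not very familiar with regex...). It does the job, and the loop only runs n times for n links, but it still feels crude: every iteration re-scans the whole string. Back then I probably knew I had to start replacing from the first match, then got stuck on how to find the position of the next match once a replacement had shifted everything, so I settled on re-matching after every replacement.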
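When every match gets the same replacement text, the re-scan loop can probably be avoided altogether: NSRegularExpression offers `stringByReplacingMatchesInString:options:range:withTemplate:`, which replaces all matches in a single call. A minimal sketch follows; the helper name `ReplaceAllMatches` is hypothetical and not part of the original code.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): replaces every match of
// `pattern` in `input` with the literal text `placeholder` in one call.
static NSString *ReplaceAllMatches(NSString *input, NSString *pattern, NSString *placeholder) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"error:%@", error);
        return input; // Leave the input untouched if the pattern is invalid.
    }
    // Escape the placeholder so that characters such as "$" or "\" are not
    // interpreted as template syntax.
    NSString *safeTemplate = [NSRegularExpression escapedTemplateForString:placeholder];
    return [regex stringByReplacingMatchesInString:input
                                           options:0
                                             range:NSMakeRange(0, input.length)
                                      withTemplate:safeTemplate];
}
```

It could be called as `ReplaceAllMatches(TestRegularString, @"(?:https?|ftp)://\\S+|www\\.\\S+", @"网页链接")`. That pattern is a deliberately loose assumption for illustration: it treats everything up to the next whitespace as part of the link.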

Now look at a trick I found on SO (Stack Overflow):

- (void)formatStringTwo {
    // http://stackoverflow.com/questions/11379593/use-nsregularexpression-to-replace-each-match-with-a-result-of-method
    NSError *error = nil;
    NSDataDetector *detector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeLink error:&error];
    // Check the returned object rather than the error pointer to detect failure.
    if (detector == nil) {
        NSLog(@"error:%@",error);
        return;
    }
    NSMutableString *formatedString = [TestRegularString mutableCopy];
    NSArray *linkArray = [detector matchesInString:formatedString options:0 range:NSMakeRange(0,formatedString.length)];
    // Walk the matches from last to first so the earlier ranges stay valid.
    for (NSTextCheckingResult *result in [linkArray reverseObjectEnumerator]) {
        [formatedString replaceCharactersInRange:result.range withString:@"网页链接"];
    }
    NSLog(@"%s formatedString:%@",__func__,formatedString);
}

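Really nice: by replacing from the back, I no longer have to care how positions shift, because a replacement only moves the text after it, and everything after it has already been handled.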
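The Stack Overflow question linked in the code is about replacing each match with the result of a method, and the same back-to-front walk works for that case too. Below is a minimal sketch with a plain NSRegularExpression; the helper name `ReplaceEachMatch` and the `transform` block are hypothetical names introduced for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces each regex match with the string returned by
// `transform`, walking the matches from last to first so that the ranges of
// earlier matches stay valid while the string changes length.
static NSString *ReplaceEachMatch(NSString *input,
                                  NSRegularExpression *regex,
                                  NSString *(^transform)(NSString *match)) {
    NSMutableString *output = [input mutableCopy];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:input options:0 range:NSMakeRange(0, input.length)];
    for (NSTextCheckingResult *result in [matches reverseObjectEnumerator]) {
        // Read the matched text from the unmodified input, then edit the copy.
        NSString *matched = [input substringWithRange:result.range];
        [output replaceCharactersInRange:result.range withString:transform(matched)];
    }
    return output;
}
```

Reading the matched text from `input` while editing `output` keeps the original ranges meaningful even though the replacements differ in length.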
Here is another version that tracks an offset instead:

- (void)formatStringThree {
    NSError *error = nil;
    NSDataDetector *detector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeLink error:&error];
    // Check the returned object rather than the error pointer to detect failure.
    if (detector == nil) {
        NSLog(@"error:%@",error);
        return;
    }
    NSMutableString *formatedString = [TestRegularString mutableCopy];
    // Net change in length caused by the replacements made so far.
    NSInteger offset = 0;
    for (NSTextCheckingResult *result in [detector matchesInString:formatedString options:0 range:NSMakeRange(0,formatedString.length)]) {
        // Shift the original range to where the match now sits.
        NSRange resultRange = result.range;
        resultRange.location += offset;

        // NSString *matchString = [detector replacementStringForResult:result inString:formatedString offset:offset template:@""];
        NSString *replaceString = @"网页链接";
        [formatedString replaceCharactersInRange:resultRange withString:replaceString];
        // Cast to signed before subtracting so a shorter replacement gives a negative delta.
        offset += (NSInteger)replaceString.length - (NSInteger)resultRange.length;
    }

    NSLog(@"%s formatedString:%@",__func__,formatedString);
}
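A different way to go front to back without any offset arithmetic is to build a new string instead of mutating the old one: copy the text between matches and append the placeholder in place of each match. This is a swapped-in technique, not something from the original code. The sketch assumes the matches are in ascending order and do not overlap, which is how `matchesInString:options:range:` returns them; the helper name `RebuildWithPlaceholder` is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: builds a new string front to back, copying the text
// between matches and appending `placeholder` in place of each match.
// Assumes `matches` is sorted by location and contains no overlapping ranges.
static NSString *RebuildWithPlaceholder(NSString *input,
                                        NSArray<NSTextCheckingResult *> *matches,
                                        NSString *placeholder) {
    NSMutableString *output = [NSMutableString stringWithCapacity:input.length];
    NSUInteger cursor = 0; // Index in `input` just past the last match handled.
    for (NSTextCheckingResult *result in matches) {
        NSRange range = result.range;
        // Copy the untouched text between the previous match and this one.
        NSRange gap = NSMakeRange(cursor, range.location - cursor);
        [output appendString:[input substringWithRange:gap]];
        [output appendString:placeholder];
        cursor = NSMaxRange(range);
    }
    // Copy whatever follows the last match.
    [output appendString:[input substringFromIndex:cursor]];
    return output;
}
```

Because the input is never modified, every range stays valid for the whole loop and no offset has to be carried along.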

OK, now let's look at the output of the three methods:

2016-04-18 23:40:39.642 RegularExpressionTest[1642:305453] -[ViewController formatStringOne] formatedString:
This is the test string:1.网页链接
2.网页链接
3.网页链接
4.Http://网页链接
5.网页链接
6.http://127.0.0.1:16823
2016-04-18 23:40:39.666 RegularExpressionTest[1642:305453] -[ViewController formatStringTwo] formatedString:
This is the test string:网页链接
2.网页链接
3.网页链接
4.网页链接
5.网页链接
6.网页链接
2016-04-18 23:40:39.668 RegularExpressionTest[1642:305453] -[ViewController formatStringThree] formatedString:
This is the test string:网页链接
2.网页链接
3.网页链接
4.网页链接
5.网页链接
6.网页链接

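As the output shows, the regex I found is not that great: it leaves the "Http://" prefix of link 4 in place and misses the IP-address link 6 entirely. NSDataDetector is not flawless either: it swallows the "1." label in front of the first link. Still, reading the Apple documentation never hurts. Apple already provides a class for detecting data, NSDataDetector, so there is really no need to write our own regex, especially one that matches incorrectly. Then it occurred to me that Swift is open source now, so I searched NSRegularExpression.swift for the keyword "link" to see how Apple writes its link matching, and found nothing... It reminded me of a saying: "Knowledge is like underwear: you can't see it, but it matters." For us programmers, the same goes for algorithms.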
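The two misses in the formatStringOne log can be traced back to the pattern. It is compiled with `options:0`, so the lowercase `http` in the pattern does not match `Http`. It also requires the host to end with a dot followed by two to four letters, which an IP address such as 127.0.0.1 does not. The sketch below checks both points with a trimmed-down stand-in for the pattern; the trimmed pattern and the names `kTrimmedLinkPattern` and `FirstLinkRange` are assumptions for illustration, not the exact regex used above.

```objc
#import <Foundation/Foundation.h>

// Trimmed-down stand-in for the pattern in formatStringOne (an assumption for
// illustration): a scheme or "www.", host characters, then a 2-4 letter suffix
// and an optional port.
static NSString *const kTrimmedLinkPattern =
    @"((https?|ftp)://|www\\.)[a-zA-Z0-9.\\-]+\\.[a-zA-Z]{2,4}(:\\d+)?";

// Hypothetical helper: range of the first link match in `text`, or a range
// whose location is NSNotFound when nothing matches.
static NSRange FirstLinkRange(NSString *text, NSRegularExpressionOptions options) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:kTrimmedLinkPattern
                                                  options:options
                                                    error:NULL];
    return [regex rangeOfFirstMatchInString:text
                                    options:0
                                      range:NSMakeRange(0, text.length)];
}
```

With `options:0`, `FirstLinkRange(@"Http://www.google.com.cn", 0)` starts at the "www" rather than at the scheme, which is consistent with the "Http://网页链接" line in the log. Passing `NSRegularExpressionCaseInsensitive` moves the start of the match to index 0. For "http://127.0.0.1:16823" neither option helps, because the numeric suffix never satisfies the letters-only requirement.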