Подборка решений регулярных выражений PHP

Feb 14, 2019 in PHP, Регулярные выражения

Содержание

Админу на хлеб с маслом - 50 руб.

Проверить домен


$url = "https://contentim.ru/";
if (preg_match('/^(http|https|ftp)://([A-Z0-9][A-Z0-9_-]*(?:.[A-Z0-9][A-Z0-9_-]*)+):?(d+)?/?/i', $url)) {
    echo "Your url is ok.";
} else {
    echo "Wrong url.";
}

Подстветка текста


$text = "Sample sentence from KomunitasWeb, regex has become popular in web programming. Now we learn regex. According to wikipedia, Regular expressions (abbreviated as regex or regexp, with plural forms regexes, regexps, or regexen) are written in a formal language that can be interpreted by a regular expression processor";
$text = preg_replace("/b(regex)b/i", '1', $text);
echo $text;

Подстветить результаты поиска в wordpress

Небольшой хак подсветки искомых слов для блога. Открываем search.php и ищет там строку:

echo $title;

И меняем ее на:


$title  = get_the_title();
$keys= explode(" ",$s);
$title  = preg_replace('/('.implode('|', $keys) .')/iu', '\0', $title);

И добавляем в css


strong.search-excerpt { background: yellow; }

Получаем все картинки со страницы


$url = "https://contentim.ru/";
$images = array();
preg_match_all('/(img|src)=("|')[^"'>]+/i', $data, $media);
unset($data);
$data=preg_replace('/(img|src)("|'|="|=')(.*)/i',"$3",$media[0]);
foreach($data as $url)
{
    $info = pathinfo($url);
    if (isset($info['extension']))
    {
        if (($info['extension'] == 'jpg') ||
        ($info['extension'] == 'jpeg') ||
        ($info['extension'] == 'gif') ||
        ($info['extension'] == 'png'))
        array_push($images, $url);
    }
}

Убираем повторяющиеся слова


$text = preg_replace("/s(w+s)1/i", "$1", $text);

Убираем повторяющиеся точки с запятыми


$text = preg_replace("/.+/i", ".", $text);

Проверям тег — XML/HTML


function get_tag( $tag, $xml ) {
  $tag = preg_quote($tag);
  preg_match_all('{<'.$tag.'[^>]*>(.*?).'}',
                   $xml,
                   $matches,
                   PREG_PATTERN_ORDER);
 
  return $matches[1];
}

Ищем совпадения по тегам — XML/HTML


function get_tag( $tag, $xml ) {
  $tag = preg_quote($tag);
  preg_match_all('{<'.$tag.'[^>]*>(.*?).'}',
                   $xml,
                   $matches,
                   PREG_PATTERN_ORDER);
 
  return $matches[1];
}

Ищем совпадения по тегам с установленным атрибутом — XML/HTML


function get_tag( $attr, $value, $xml, $tag=null ) {
  if( is_null($tag) )
    $tag = '\w+';
  else
    $tag = preg_quote($tag);
 
  $attr = preg_quote($attr);
  $value = preg_quote($value);
 
  $tag_regex = "/<(".$tag.")[^>]*$attr\s*=\s*".
                "(['\"])$value\\2[^>]*>(.*?)<\/\\1>/"
 
  preg_match_all($tag_regex,
                 $xml,
                 $matches,
                 PREG_PATTERN_ORDER);
 
  return $matches[3];
}

Проверяем шестнадцатеричное значение


$string = "#555555";
if (preg_match('/^#(?:(?:[a-fd]{3}){1,2})$/i', $string)) {
echo "example 6 successful.";
}

Находим заголовок страницы


$fp = fopen("https://contentim.ru/blog","r");
while (!feof($fp) ){
    $page .= fgets($fp, 4096);
}
 
$titre = eregi("",$page,$regs);
echo $regs[1];
fclose($fp);

Распарсиваем логи апача


//Logs: Apache web server
//Successful hits to HTML files only.  Useful for counting the number of page views.
'^((?#client IP or domain name)S+)s+((?#basic authentication)S+s+S+)s+[((?#date and time)[^]]+)]s+"(?:GET|POST|HEAD) ((?#file)/[^ ?"]+?.html?)??((?#parameters)[^ ?"]+)? HTTP/[0-9.]+"s+(?#status code)200s+((?#bytes transferred)[-0-9]+)s+"((?#referrer)[^"]*)"s+"((?#user agent)[^"]*)"$'
 
//Logs: Apache web server
//404 errors only
'^((?#client IP or domain name)S+)s+((?#basic authentication)S+s+S+)s+[((?#date and time)[^]]+)]s+"(?:GET|POST|HEAD) ((?#file)[^ ?"]+)??((?#parameters)[^ ?"]+)? HTTP/[0-9.]+"s+(?#status code)404s+((?#bytes transferred)[-0-9]+)s+"((?#referrer)[^"]*)"s+"((?#user agent)[^"]*)"$'

Заменяем двойные кавычки на безопасный аналог


preg_replace('B"b([^"x84x93x94rn]+)b"B', '?1?', $text);

Проверка стойкости пароля


'A(?=[-_a-zA-Z0-9]*?[A-Z])(?=[-_a-zA-Z0-9]*?[a-z])(?=[-_a-zA-Z0-9]*?[0-9])[-_a-zA-Z0-9]{6,}z'

Админу на хлеб с маслом - 50 руб.