mirror of https://github.com/gorhill/uBlock.git
New cosmetic filter parser using CSSTree library
The new parser no longer uses the browser DOM to validate a cosmetic filter; validation is now done through a JS library, CSSTree.

This means filter list authors will have to be more careful to ensure that a cosmetic filter is really valid, as there is no longer any guarantee that a cosmetic filter which works in a given browser/version will still work properly in another browser, or in a different version of the same browser.

This change became necessary for many reasons, one of them being the flakiness of the previous parser as exposed by many recent issues:

- https://github.com/uBlockOrigin/uBlock-issues/issues/2262
- https://github.com/uBlockOrigin/uBlock-issues/issues/2228

The new parser introduces breaking changes; there was no way to do otherwise. Some current procedural cosmetic filters will be shown as invalid with this change. This occurs because the CSSTree library gets confused by some syntax which the previous, more permissive parser allowed.

The issue is mainly with the arguments passed to some procedural cosmetic filters, and it can be solved as follows:

Use quotes around the argument. You can use either single or double quotes, whichever is most convenient. If your argument contains a single quote, use double quotes, and vice versa.

Additionally, try to escape a quote inside an argument using a backslash. This may work, but if not, use quotes around the argument.

When the parser encounters quotes around an argument, it will discard them before trying to process the argument; the same goes for escaped quotes inside the argument.

Examples:

Breakage:
    ...##^script:has-text(toscr')
Fix:
    ...##^script:has-text(toscr\')

Breakage:
    ...##:xpath(//*[contains(text(),"VPN")]):upward(2)
Fix:
    ...##:xpath('//*[contains(text(),"VPN")]'):upward(2)

There are not many filters which break in the default set of filter lists, so this should be workable for default lists.

Unfortunately those fixes will break the filter for previous versions of uBO, since these do not deal with quoted arguments. In such cases, it may be necessary to keep the previous filter, which will be discarded as broken on newer versions of uBO.

This was a necessary change, as the old parser was becoming more and more flaky after being constantly patched for new cases arising. The new parser should be far more robust and stay robust through expanding procedural cosmetic filter syntax.

Additionally, in the MV3 version, filters are pre-compiled using a Nodejs script, i.e. outside the browser, so validating cosmetic filters using a live DOM no longer made sense.

This new parser will have to be tested thoroughly before stable release.
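The quoting rule described above can be illustrated with a small sketch. This is not uBO's actual code: `unquoteArgument` is a hypothetical helper, and it only assumes what the message states, namely that surrounding quotes are discarded and that escaped quotes inside the argument are unescaped.

```javascript
// Hypothetical helper, not part of uBO: shows how a quoted argument of a
// procedural operator could be normalized before being processed.
function unquoteArgument(arg) {
    const s = arg.trim();
    const first = s.charAt(0);
    const quoted = s.length >= 2 &&
        (first === '"' || first === "'") &&
        s.endsWith(first);
    // Discard the surrounding quotes, if any.
    const inner = quoted ? s.slice(1, -1) : s;
    // Turn escaped quotes (\' or \") into plain quotes.
    return inner.replace(/\\(['"])/g, '$1');
}

console.log(unquoteArgument(`'//*[contains(text(),"VPN")]'`));
console.log(unquoteArgument("toscr\\'"));
```

With this normalization, both fixed filters from the examples yield the same argument the old, unquoted form was meant to express.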
parent fe21ce5002
commit a71b71e4c8
@@ -22,6 +22,8 @@

 // For background page or non-background pages

+/* global browser */
+
 'use strict';

 /******************************************************************************/
@@ -89,6 +91,9 @@ vAPI.webextFlavor = {
         soup.add('chromium')
             .add('user_stylesheet');
         flavor.major = parseInt(match[1], 10) || 0;
+        if ( flavor.major >= 105 ) {
+            soup.add('native_css_has');
+        }
     }

     // Don't starve potential listeners

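The hunk above adds a `native_css_has` token to the environment when the Chromium major version is 105 or later. A minimal sketch of that gating follows; `chromiumEnvTokens` is a hypothetical function written for illustration, and only the token names and the version threshold are taken from the diff.

```javascript
// Hypothetical sketch: derive environment tokens from a Chromium major
// version, mirroring the version check added in the hunk above.
function chromiumEnvTokens(major) {
    const soup = new Set([ 'chromium', 'user_stylesheet' ]);
    // 105 is the threshold used in the diff for native :has() support.
    if ( major >= 105 ) {
        soup.add('native_css_has');
    }
    return Array.from(soup);
}

console.log(chromiumEnvTokens(104));
console.log(chromiumEnvTokens(105));
```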
@@ -55,7 +55,7 @@ const outputDir = commandLineArgs.get('output') || '.';
 const cacheDir = `${outputDir}/../mv3-data`;
 const rulesetDir = `${outputDir}/rulesets`;
 const scriptletDir = `${rulesetDir}/js`;
-const env = [ 'chromium', 'ubol' ];
+const env = [ 'chromium', 'ubol', 'native_css_has' ];

 /******************************************************************************/


@@ -37,6 +37,7 @@
     <div class="li"><span><a href="https://github.com/foo123/RegexAnalyzer" target="_blank">Regular Expression Analyzer</a> by <a href="https://github.com/foo123">Nikos M.</a></span></div>
     <div class="li"><span><a href="https://github.com/hsluv/hsluv" target="_blank">HSLuv - Human-friendly HSL</a> by <a href="https://github.com/boronine">Alexei Boronine</a></span></div>
     <div class="li"><span><a href="https://searchfox.org/mozilla-central/rev/d317e93d9a59c9e4c06ada85fbff9f6a1ceaaad1/browser/extensions/webcompat/shims/google-ima.js" target="_blank">google-ima.js</a> by <a href="https://www.mozilla.org/">Mozilla</a></span></div>
+    <div class="li"><span><a href="https://github.com/csstree/csstree" target="_blank">CSSTree</a> by <a href="https://github.com/lahmatiy">Roman Dvornov</a></span></div>
 </div>
 <div class="li"><span data-i18n="aboutCDNs"></span></div>
 <div class="liul">

@@ -95,19 +95,6 @@ class PSelectorMatchesCSSTask extends PSelectorTask {
         }
     }
 }
-class PSelectorMatchesCSSAfterTask extends PSelectorMatchesCSSTask {
-    constructor(task) {
-        super(task);
-        this.pseudo = '::after';
-    }
-}
-
-class PSelectorMatchesCSSBeforeTask extends PSelectorMatchesCSSTask {
-    constructor(task) {
-        super(task);
-        this.pseudo = '::before';
-    }
-}

 class PSelectorMatchesMediaTask extends PSelectorTask {
     constructor(task) {
@@ -247,6 +234,20 @@ class PSelectorSpathTask extends PSelectorTask {
             output.push(node);
         }
     }
+    // Helper method for other operators.
+    static qsa(node, selector) {
+        const parent = node.parentElement;
+        if ( parent === null ) { return []; }
+        let pos = 1;
+        for (;;) {
+            node = node.previousElementSibling;
+            if ( node === null ) { break; }
+            pos += 1;
+        }
+        return parent.querySelectorAll(
+            `:scope > :nth-child(${pos})${selector}`
+        );
+    }
 }

 class PSelectorUpwardTask extends PSelectorTask {
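The `qsa` helper added above turns a selector that starts with a combinator into one that can be run from the parent element: it counts the element's position among its siblings and anchors the query with `:scope > :nth-child(pos)`. The sketch below reproduces only that position and selector computation outside the browser. `nthChildPosition` and `scopedSelector` are hypothetical names, and plain objects stand in for DOM elements.

```javascript
// 1-based index of `node` among its element siblings, which is the value
// :nth-child() expects. Same counting loop as in the hunk above.
function nthChildPosition(node) {
    let pos = 1;
    for (;;) {
        node = node.previousElementSibling;
        if ( node === null ) { break; }
        pos += 1;
    }
    return pos;
}

// Build the selector to run from the parent element. `relativeSelector`
// is expected to start with a combinator, e.g. ' > .ad'.
function scopedSelector(node, relativeSelector) {
    return `:scope > :nth-child(${nthChildPosition(node)})${relativeSelector}`;
}

// Plain objects standing in for DOM elements; only the one property
// used by the loop is provided.
const first = { previousElementSibling: null };
const second = { previousElementSibling: first };
const third = { previousElementSibling: second };

console.log(scopedSelector(third, ' > .ad')); // prints ":scope > :nth-child(3) > .ad"
```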
@@ -339,23 +340,20 @@ class PSelector {
     constructor(o) {
         if ( PSelector.prototype.operatorToTaskMap === undefined ) {
             PSelector.prototype.operatorToTaskMap = new Map([
-                [ ':has', PSelectorIfTask ],
-                [ ':has-text', PSelectorHasTextTask ],
-                [ ':if', PSelectorIfTask ],
-                [ ':if-not', PSelectorIfNotTask ],
-                [ ':matches-css', PSelectorMatchesCSSTask ],
-                [ ':matches-css-after', PSelectorMatchesCSSAfterTask ],
-                [ ':matches-css-before', PSelectorMatchesCSSBeforeTask ],
-                [ ':matches-media', PSelectorMatchesMediaTask ],
-                [ ':matches-path', PSelectorMatchesPathTask ],
-                [ ':min-text-length', PSelectorMinTextLengthTask ],
-                [ ':not', PSelectorIfNotTask ],
-                [ ':nth-ancestor', PSelectorUpwardTask ],
-                [ ':others', PSelectorOthersTask ],
-                [ ':spath', PSelectorSpathTask ],
-                [ ':upward', PSelectorUpwardTask ],
-                [ ':watch-attr', PSelectorWatchAttrs ],
-                [ ':xpath', PSelectorXpathTask ],
+                [ 'has', PSelectorIfTask ],
+                [ 'has-text', PSelectorHasTextTask ],
+                [ 'if', PSelectorIfTask ],
+                [ 'if-not', PSelectorIfNotTask ],
+                [ 'matches-css', PSelectorMatchesCSSTask ],
+                [ 'matches-media', PSelectorMatchesMediaTask ],
+                [ 'matches-path', PSelectorMatchesPathTask ],
+                [ 'min-text-length', PSelectorMinTextLengthTask ],
+                [ 'not', PSelectorIfNotTask ],
+                [ 'others', PSelectorOthersTask ],
+                [ 'spath', PSelectorSpathTask ],
+                [ 'upward', PSelectorUpwardTask ],
+                [ 'watch-attr', PSelectorWatchAttrs ],
+                [ 'xpath', PSelectorXpathTask ],
             ]);
         }
         this.raw = o.raw;
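In the hunk above, operator names lose their leading colon, and the `nth-ancestor`, `matches-css-after` and `matches-css-before` entries are not carried over to the new map. A rough sketch of how such a map can drive task creation is shown below. The two task classes and `tasksFromCompiled` are hypothetical stand-ins, and the `[ operator, argument ]` pair shape is assumed from the `tasks[0][0]` / `tasks[0][1]` accesses elsewhere in this commit.

```javascript
// Hypothetical stand-ins for the task classes referenced in the diff.
class HasTextTask { constructor(arg) { this.arg = arg; } }
class UpwardTask { constructor(arg) { this.arg = arg; } }

// Operator names are keyed without the leading colon, as in the new map.
const operatorToTaskMap = new Map([
    [ 'has-text', HasTextTask ],
    [ 'upward', UpwardTask ],
]);

// Turn compiled [ operator, argument ] pairs into task instances.
function tasksFromCompiled(compiled) {
    const tasks = [];
    for ( const [ operator, arg ] of compiled ) {
        const Task = operatorToTaskMap.get(operator);
        if ( Task === undefined ) {
            throw new Error(`Unknown operator: ${operator}`);
        }
        tasks.push(new Task(arg));
    }
    return tasks;
}

console.log(tasksFromCompiled([ [ 'has-text', 'VPN' ], [ 'upward', 2 ] ]).length);
```

Under this scheme, data compiled with the old colon-prefixed names would no longer find a match, which is one reason the compiled format and the content-script map have to change together.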
@@ -374,7 +372,12 @@ class PSelector {
     prime(input) {
         const root = input || document;
         if ( this.selector === '' ) { return [ root ]; }
-        return Array.from(root.querySelectorAll(this.selector));
+        let selector = this.selector;
+        if ( input !== document && /^ [>+~]/.test(this.selector) ) {
+            return Array.from(PSelectorSpathTask.qsa(input, this.selector));
+        }
+        const elems = root.querySelectorAll(selector);
+        return Array.from(elems);
     }
     exec(input) {
         let nodes = this.prime(input);
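The new branch in `prime()` routes selectors that begin with a space followed by a combinator (`>`, `+` or `~`) through `PSelectorSpathTask.qsa()`. The following sketch isolates that decision only. `queryStrategy` and its return labels are hypothetical, and `hasExplicitInput` is an assumed stand-in for the `input !== document` test.

```javascript
// Sketch: decide how a selector should be queried, following the branch
// added to prime(). The returned labels are illustrative only.
function queryStrategy(selector, hasExplicitInput) {
    if ( selector === '' ) { return 'root-only'; }
    // A leading space followed by a combinator marks a relative selector,
    // which querySelectorAll() cannot evaluate directly.
    if ( hasExplicitInput && /^ [>+~]/.test(selector) ) {
        return 'relative';
    }
    return 'plain';
}

console.log(queryStrategy(' > .ad', true));
console.log(queryStrategy('.ad', true));
```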
@@ -453,7 +456,7 @@ class ProceduralFilterer {
         let style, styleToken;
         if ( selector.action === undefined ) {
             style = vAPI.hideStyle;
-        } else if ( selector.action[0] === ':style' ) {
+        } else if ( selector.action[0] === 'style' ) {
             style = selector.action[1];
         }
         if ( style !== undefined ) {

@@ -429,7 +429,7 @@ FilterContainer.prototype.compileGenericHideSelector = function(
     // https://github.com/uBlockOrigin/uBlock-issues/issues/131
     // Support generic procedural filters as per advanced settings.
     // TODO: prevent double compilation.
-    if ( compiled !== raw ) {
+    if ( compiled.charCodeAt(0) === 0x7B /* '{' */ ) {
         if ( µb.hiddenSettings.allowGenericProceduralFilters === true ) {
             return this.compileSpecificSelector(parser, '', false, writer);
         }
@@ -830,12 +830,12 @@ FilterContainer.prototype.cssRuleFromProcedural = function(json) {
     let mq;
     if ( tasks !== undefined ) {
         if ( tasks.length > 1 ) { return; }
-        if ( tasks[0][0] !== ':matches-media' ) { return; }
+        if ( tasks[0][0] !== 'matches-media' ) { return; }
         mq = tasks[0][1];
     }
     let style;
     if ( Array.isArray(action) ) {
-        if ( action[0] !== ':style' ) { return; }
+        if ( action[0] !== 'style' ) { return; }
         style = action[1];
     }
     if ( mq === undefined && style === undefined ) { return; }

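The checks in `cssRuleFromProcedural` accept a procedural filter only when it has at most one `matches-media` task and, optionally, a `style` action. The sketch below shows how such a filter could then be expressed as plain CSS. `cssRuleFromCompiled` is a hypothetical function, and both the exact output format and the default hiding declaration are assumptions, since the rest of the real function is not visible in this diff. Unlike the hunk, the sketch also rejects an empty task list.

```javascript
// Sketch only: convert a compiled procedural filter into a plain CSS rule
// when it consists of exactly one 'matches-media' task and/or a 'style'
// action. Output format and default declaration are assumptions.
function cssRuleFromCompiled({ selector, tasks, action }) {
    let mq;
    if ( tasks !== undefined ) {
        if ( tasks.length !== 1 ) { return; }
        if ( tasks[0][0] !== 'matches-media' ) { return; }
        mq = tasks[0][1];
    }
    let style;
    if ( Array.isArray(action) ) {
        if ( action[0] !== 'style' ) { return; }
        style = action[1];
    }
    if ( mq === undefined && style === undefined ) { return; }
    const declarations = style !== undefined
        ? style
        : 'display:none!important;';
    const rule = `${selector} { ${declarations} }`;
    return mq !== undefined ? `@media ${mq} { ${rule} }` : rule;
}

console.log(cssRuleFromCompiled({
    selector: '.ad',
    tasks: [ [ 'matches-media', '(min-width: 800px)' ] ],
}));
```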
@@ -834,7 +834,10 @@ const startPicker = function() {
     $id('candidateFilters').addEventListener('click', onCandidateClicked);
     $stor('#resultsetDepth input').addEventListener('input', onDepthChanged);
     $stor('#resultsetSpecificity input').addEventListener('input', onSpecificityChanged);
-    staticFilteringParser = new StaticFilteringParser({ interactive: true });
+    staticFilteringParser = new StaticFilteringParser({
+        interactive: true,
+        nativeCssHas: vAPI.webextFlavor.env.includes('native_css_has'),
+    });
 };

 /******************************************************************************/

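Several hunks in this commit build the parser options the same way: existing options are kept and `nativeCssHas` is derived from whether the environment contains the `native_css_has` token. A small sketch of that pattern follows. `parserOptionsFromEnv` is a hypothetical helper, not something present in the diff, where the option objects are written inline.

```javascript
// Hypothetical helper: merge caller-supplied parser options with the
// nativeCssHas flag derived from the environment tokens.
function parserOptionsFromEnv(env = [], extra = {}) {
    return Object.assign({}, extra, {
        nativeCssHas: env.includes('native_css_has'),
    });
}

console.log(parserOptionsFromEnv([ 'chromium', 'native_css_has' ], { interactive: true }));
console.log(parserOptionsFromEnv([ 'firefox' ]));
```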
@@ -1721,7 +1721,9 @@ const getURLFilteringData = function(details) {
 };

 const compileTemporaryException = function(filter) {
-    const parser = new StaticFilteringParser();
+    const parser = new StaticFilteringParser({
+        nativeCssHas: vAPI.webextFlavor.env.includes('native_css_has'),
+    });
     parser.analyze(filter);
     if ( parser.shouldDiscard() ) { return; }
     return staticExtFilteringEngine.compileTemporary(parser);

@@ -134,7 +134,9 @@ const fromNetFilter = async function(rawFilter) {
     if ( typeof rawFilter !== 'string' || rawFilter === '' ) { return; }

     const writer = new CompiledListWriter();
-    const parser = new StaticFilteringParser();
+    const parser = new StaticFilteringParser({
+        nativeCssHas: vAPI.webextFlavor.env.includes('native_css_has'),
+    });
     parser.setMaxTokenLength(staticNetFilteringEngine.MAX_TOKEN_LENGTH);
     parser.analyze(rawFilter);


@@ -97,6 +97,7 @@ function addExtendedToDNR(context, parser) {
         if ( bad ) { continue; }
         if ( hn.endsWith('.*') ) { continue; }
         const { compiled, exception } = parser.result;
+        if ( typeof compiled !== 'string' ) { continue; }
         if ( compiled.startsWith('{') ) { continue; }
         if ( exception ) { continue; }
         let details = context.cosmeticFilters.get(compiled);
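The line added in this hunk guards the `startsWith('{')` call: if `compiled` is not a string, calling a string method on it would throw. The sketch below gathers the three checks into one predicate. `isPlainCssResult` is a hypothetical name, and the assumptions that an invalid filter may leave `compiled` undefined and that a leading `{` marks a JSON-encoded procedural filter are inferred from the diff rather than stated in it.

```javascript
// Hypothetical predicate combining the checks from the hunk above:
// keep only non-exception results whose compiled form is a plain
// (non-JSON) selector string.
function isPlainCssResult(result) {
    const { compiled, exception } = result;
    // Assumption: an invalid filter may leave `compiled` undefined.
    if ( typeof compiled !== 'string' ) { return false; }
    // Assumption: a leading '{' marks a JSON-encoded procedural filter.
    if ( compiled.startsWith('{') ) { return false; }
    if ( exception ) { return false; }
    return true;
}

console.log(isPlainCssResult({ compiled: '.ad', exception: false }));
console.log(isPlainCssResult({ compiled: undefined, exception: false }));
```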
@@ -126,14 +127,14 @@ function addExtendedToDNR(context, parser) {
 /******************************************************************************/

 function addToDNR(context, list) {
+    const env = context.env || [];
     const writer = new CompiledListWriter();
     const lineIter = new LineIterator(
-        StaticFilteringParser.utils.preparser.prune(
-            list.text,
-            context.env || []
-        )
+        StaticFilteringParser.utils.preparser.prune(list.text, env)
     );
-    const parser = new StaticFilteringParser();
+    const parser = new StaticFilteringParser({
+        nativeCssHas: env.includes('native_css_has'),
+    });
     const compiler = staticNetFilteringEngine.createCompiler(parser);

     writer.properties.set('name', list.name);
@@ -180,10 +181,9 @@ function addToDNR(context, list) {
 /******************************************************************************/

 async function dnrRulesetFromRawLists(lists, options = {}) {
-    const context = {};
+    const context = Object.assign({}, options);
     staticNetFilteringEngine.dnrFromCompiled('begin', context);
-    context.extensionPaths = new Map(options.extensionPaths || []);
-    context.env = options.env;
+    context.extensionPaths = new Map(context.extensionPaths || []);
     const toLoad = [];
     const toDNR = (context, list) => addToDNR(context, list);
     for ( const list of lists ) {

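In `dnrRulesetFromRawLists`, the context is now a shallow copy of `options` instead of an empty object with individually copied fields, so any option (such as `env`) reaches the context without extra code. A minimal sketch of that pattern is shown below; `makeContext` is a hypothetical name used for illustration.

```javascript
// Sketch of the context construction shown in the hunk above.
function makeContext(options = {}) {
    // Shallow copy: every option (e.g. env) becomes a context property,
    // and the caller's options object is left untouched.
    const context = Object.assign({}, options);
    // Normalize extensionPaths to a Map, whether it was omitted or given
    // as an array of [ key, value ] pairs.
    context.extensionPaths = new Map(context.extensionPaths || []);
    return context;
}

const options = { env: [ 'chromium' ], extensionPaths: [ [ 'a', '/b' ] ] };
const context = makeContext(options);
console.log(context.env, context.extensionPaths.get('a'));
```

One design consequence of the shallow copy is that array or object values such as `env` are shared by reference between `options` and the context, while `extensionPaths` gets a fresh `Map`.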
File diff suppressed because it is too large
@@ -19,7 +19,7 @@
     Home: https://github.com/gorhill/uBlock
 */

-/* globals WebAssembly */
+/* globals browser, WebAssembly */

 'use strict';

@@ -973,7 +973,10 @@ self.addEventListener('hiddenSettingsChanged', ( ) => {
     // Useful references:
     // https://adblockplus.org/en/filter-cheatsheet
     // https://adblockplus.org/en/filters
-    const parser = new StaticFilteringParser({ expertMode });
+    const parser = new StaticFilteringParser({
+        expertMode,
+        nativeCssHas: vAPI.webextFlavor.env.includes('native_css_has'),
+    });
     const compiler = staticNetFilteringEngine.createCompiler(parser);
     const lineIter = new LineIterator(
         parser.utils.preparser.prune(rawText, vAPI.webextFlavor.env)

@@ -0,0 +1,19 @@
+Copyright (C) 2016-2022 by Roman Dvornov
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.

File diff suppressed because one or more lines are too long

@@ -23,6 +23,7 @@ cp src/js/uri-utils.js $DES/js
 cp src/js/url-net-filtering.js $DES/js

 mkdir -p $DES/lib
+cp -R src/lib/csstree $DES/lib/
 cp -R src/lib/punycode.js $DES/lib/
 cp -R src/lib/regexanalyzer $DES/lib/
 cp -R src/lib/publicsuffixlist $DES/lib/