Add, edit, or remove domains
Introduction
This article provides instructions for admin users on how to add, edit, and remove domains.
Add a domain
This section provides instructions for admin users on how to add a domain.
Click Admin Settings (gear icon) on the top menu bar of any page in the platform. The Domain Settings page opens.
Click Add New Domain.
Provide the requested information.
Domain Details section
Provide the following information in the Domain Details section.
Title (required field): Click in the field and type in a title for the domain.
URL (required field): Click in the field and enter the website URL. Keep in mind that website addresses that end in .com and .eu, for example, are two different domains. Each domain has a unique script and needs to be entered as a separate domain in the list.
Note
The scan also includes all pages that are linked to the URL.
CMS: Click to expand the list and select the CMS. For advanced help files, visit CMS Integration.
Note
Some CMS providers have implemented scan rate limiting on their users.
If the CMS is not on the list, contact the support team to see if it can be added.
CMS URL: Click in the text field and type in the base URL for the CMS page. This field is required.
Language: Click to expand the list and select the main language for the site. This will determine which dictionary is used to detect spelling and other errors. Words on a page that are written in another language are usually tagged as misspelled words. For instructions on how to repair the HTML code for this Accessibility error, visit the HTML School site (external link).
Scan section
This section provides instructions on how to fill out the Scan section.
Crawl Automatically: Toggle the switch to ON to set up a schedule for recurring scans. Toggle the switch to OFF to scan manually when needed.
Scan Frequency: Click the drop-down arrow and select the scan frequency.
The available options for the scan frequency are:
Weekly
Every 2 weeks
Every 4 weeks
Every 8 weeks
Every 12 weeks
Note
This is an add-on. If this option is not available and you wish to change your scan frequency, contact your sales representative to upgrade the account. The default scan frequency is weekly.
Day: Click the drop-down arrow and select the day of the week for the recurring scan.
Time: Click in the number field and enter the time of day (12-hour clock), and then choose AM for 00:00 to 11:59 or PM for 12:00 to 23:59.
Connections per minute: Click the drop-down arrow and select Connections per minute. The choices are:
Slow: 15 connections per minute.
Normal (recommended): 60 connections per minute.
Faster: 90 connections per minute.
Very fast (be cautious): 120 connections per minute.
Superfast (advanced users only): 180 connections per minute.
Note
Some CMS providers have implemented scan rate limiting on their users. For more information on why you should consider reducing the connection speed, visit CMS rate limiting on bots and crawlers (FAQ).
The server response time may be adversely affected if the connections per minute setting is Faster or higher. At these settings, the speed of the scan may not change if the server is slow to respond.
The scanner can run a maximum of five simultaneous scans per IP address. When several websites use the same IP address, the scan uses the slowest connections per minute setting from the IP address group.
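As a rough illustration of the settings above, the following sketch converts each connections per minute option into an average delay between requests and picks the effective rate for a group of domains that share one IP address. The names CONNECTIONS_PER_MINUTE, seconds_between_requests, and effective_rate are hypothetical, and evenly spaced requests are an assumption; the scanner's actual scheduling is not described in this article.

```python
# Hypothetical sketch: option names and rates are copied from the list above.
CONNECTIONS_PER_MINUTE = {
    "Slow": 15,
    "Normal": 60,
    "Faster": 90,
    "Very fast": 120,
    "Superfast": 180,
}


def seconds_between_requests(setting: str) -> float:
    """Average gap between requests, assuming they are evenly spaced."""
    return 60 / CONNECTIONS_PER_MINUTE[setting]


def effective_rate(settings_on_same_ip: list[str]) -> int:
    """Domains that share an IP address are scanned at the slowest chosen rate."""
    return min(CONNECTIONS_PER_MINUTE[s] for s in settings_on_same_ip)


print(seconds_between_requests("Slow"))      # 4.0
print(effective_rate(["Faster", "Normal"]))  # 60
```

At the Slow setting this works out to one request roughly every four seconds, which is why a lower setting is gentler on a server or CMS that applies rate limiting.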
Max scanned pages: Enter the maximum number of pages to scan, as specified in your Acquia contract.
Crawl (Scan) Options
This section provides instructions on how to fill out the Crawl (scan) Options section.
Toggle the switches ON to activate the different scan options.
Scan subdomains: Toggle the switch ON to include subdomains in the scan.
Spelling ignore capitalized words: Toggle the switch ON to ignore words that begin with a capital letter.
Note
Words that the user includes in the dictionary are case-sensitive. To instruct the scan to ignore words that are written in all caps, and bypass the need to, for example, add the word to the dictionary, toggle this switch to ON.
Case sensitive URLs: Toggle the switch ON to include case-sensitive URLs in the scan.
This is set to the optimal position by default.
Example OFF: http://monsido.com/Foo/Bar is seen as the same link as: http://monsido.com/foo/bar.
Example ON: The same links above will register as two different links.
Note
In the OFF position, the scan sees all URLs as lowercase, regardless of how they are written on the target website. Turn the switch OFF only in rare cases where the target website does not recognize caps and reads all links in lowercase, which results in duplicates in the flagged scan issues. This is an advanced configuration that should normally be left at the default setting.
Remember: When Use language attribute is enabled, the scan does not auto-detect the page language. For example, spell checks are always for the language specified in the lang attribute.
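The effect of the Case sensitive URLs switch can be pictured with a short sketch. It assumes that the scan simply lowercases every URL when the switch is OFF; the function name distinct_links is hypothetical and not part of the product.

```python
def distinct_links(urls, case_sensitive):
    """Return the set of links that would be treated as different."""
    if case_sensitive:
        return set(urls)
    # OFF: every URL is compared in lowercase, so case variants collapse.
    return {url.lower() for url in urls}


links = ["http://monsido.com/Foo/Bar", "http://monsido.com/foo/bar"]
print(len(distinct_links(links, case_sensitive=False)))  # 1
print(len(distinct_links(links, case_sensitive=True)))   # 2
```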
Statistics feature options
Some websites, including CP-built sites, have URLs that use a mix of lowercase and uppercase letters, for example https://www.example.com/TestPage.
While the crawler saves the URL as all lowercase in our database, the Statistics module provider, Matomo, does not follow that practice and saves it as written by the website. This results in a scenario where the scan cannot map page statistics for the page: https://www.example.com/TestPage to the page in the database because it was saved as: https://www.example.com/testpage.
This option forces Matomo to put all URLs in lowercase so that they match the entries in our database.
Convert all URLs to lower case: Toggle the switch ON to convert all URL addresses to lowercase.
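A minimal sketch of the mismatch described above, assuming that page records are keyed by the lowercased URL and that statistics arrive keyed by the URL as written on the website. The two dictionaries, the visit count, and the function map_statistics are hypothetical stand-ins for illustration, not the platform's or Matomo's actual data model.

```python
# Hypothetical page records, stored under lowercase URLs by the crawler.
pages = {"https://www.example.com/testpage": {"title": "Test page"}}

# Hypothetical statistics rows, keyed by the URL as the website wrote it.
stats = {"https://www.example.com/TestPage": {"visits": 42}}


def map_statistics(pages, stats, convert_to_lowercase):
    """Attach statistics rows to the pages whose URL key matches."""
    matched = {}
    for url, row in stats.items():
        key = url.lower() if convert_to_lowercase else url
        if key in pages:
            matched[key] = row
    return matched


# Without conversion the mixed-case key finds no page record.
print(map_statistics(pages, stats, convert_to_lowercase=False))  # {}
# With conversion the statistics row maps to the stored page.
print(map_statistics(pages, stats, convert_to_lowercase=True))
```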
Advanced Domain Options section
Provide the following information in the Advanced Domain Options section.
Path Constraints
Search: Enter a search parameter for matching strings within the Constraint Patterns list.
Constraint pattern: Enter a constraint pattern.
+ Add: Click + to add a new Constraint pattern. An empty row is added to the list.
Note
The window shows only the first five items. When the list contains more than five items, it is split across pages.
Delete: Click the trashcan icon to delete an item from the list.
Link Excludes
Search: Enter a search parameter for matching strings within the Link excludes list.
Exclude pattern: Enter a pattern to exclude from the scan.
+ Add: Click + to add a new Exclude pattern. An empty row appears in the list.
Delete: Click the trashcan icon on an item in the list to delete it.
Internal URLs
Operator: Click the drop-down arrow to select Contains, Starts with, or Regex.
This section provides information for admin users on the available advanced domain setting options.
Path Constraints
Use Path Constraints to instruct the scan to only process parts of a domain. The scan regards URLs that match the pattern as internal on the site, and URLs that do not match are regarded in the same way as external links.
A Path Constraint can be a word or a regular expression. In most cases, users set up Path Constraints to:
Restrict the scanner to only recognize parts of a site with a pattern such as ^/en. This instructs the scan to handle any URL that does not begin with /en, for example http://foo.com/fr/bar, as an external link. This means that the crawler tests the link but does not follow any links on http://foo.com/fr/bar.
Instruct the scan to ignore parts of the site with a pattern such as !^/fr. This instructs the scan to handle any URL that begins with /fr, for example http://foo.com/fr/bar, as an external link. This means that the scan tests the link but does not follow any links on http://foo.com/fr/bar.
The difference between the two is that in the first case ALL pages under /en are scanned and nothing else. In the second example, all pages EXCEPT /fr are scanned.
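The two cases above can be sketched in a few lines. This is an illustration only: it assumes that the constraint is applied to the path part of the URL as a regular expression and that a leading ! inverts the match. The function name is_internal is hypothetical, and the scanner's real matching rules may differ.

```python
import re
from urllib.parse import urlparse


def is_internal(url: str, constraint: str) -> bool:
    """Return True if the URL would be treated as internal.

    Assumption: the constraint is matched against the URL path, and a
    leading "!" inverts the match.
    """
    negate = constraint.startswith("!")
    pattern = constraint[1:] if negate else constraint
    matched = re.search(pattern, urlparse(url).path) is not None
    return matched != negate


print(is_internal("http://foo.com/en/about", "^/en"))   # True
print(is_internal("http://foo.com/fr/bar", "^/en"))     # False
print(is_internal("http://foo.com/fr/bar", "!^/fr"))    # False
print(is_internal("http://foo.com/en/about", "!^/fr"))  # True
```

A URL for which the function returns False corresponds to a link that is tested but not followed.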
Important
When you use Path constraints, make sure that the URL for the domain is set to a page that matches the constraint.
If you do not do this, the scan will only run on one page, since the scanner cannot proceed to any other page than the page it starts on.
For example, with a constraint of "^/en/booking", starting the crawler on http://foo.com will not work. The crawler requests http://foo.com, receives the page, and finds that no links match http://foo.com/en/booking. As a result, only the first page is scanned.
Use Link Excludes to instruct the crawler to completely ignore a link on the pages. An exclude pattern can be a word or a regular expression. Pages that match the pattern are not tested.
Use Link Excludes to:
Filter out print pages with a pattern such as print=true
This instructs the scan to ignore (and not test) any URL with the pattern, for example:
http://foo.com/bar?print=true
Filter out redirected login pages with a pattern such as:
login.aspx?return_url=zyx
This instructs the scan to ignore all URLs with the pattern, for example http://foo.com/bar/login.aspx?return_url=zyx.
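The two exclude patterns above can be tried out with a small sketch. It treats each pattern as a plain substring of the URL, which is an assumption made for simplicity; a pattern written as a regular expression would need a regex match instead. The function name is_excluded is hypothetical.

```python
def is_excluded(url: str, exclude_patterns: list[str]) -> bool:
    """Return True if any exclude pattern occurs in the URL.

    Assumption: patterns are plain substrings, not regular expressions.
    """
    return any(pattern in url for pattern in exclude_patterns)


patterns = ["print=true", "login.aspx?return_url=zyx"]
print(is_excluded("http://foo.com/bar?print=true", patterns))  # True
print(is_excluded("http://foo.com/bar/login.aspx?return_url=zyx", patterns))  # True
print(is_excluded("http://foo.com/bar", patterns))  # False
```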
Tip! If "Scan subdomains" is turned on for the domain, use the § sign in front of the exclude pattern to match URLs that use the full string instead of the relative one.
For example, to exclude the scan for the "blog" subdomain, enter the pattern:
Configure internal URLs with the following options in this section:
Operator: Click the drop-down arrow to select Contains, Starts with, or Regex.
URL: Type a URL in the field.
Delete: Click the trashcan icon to delete the row.
+ Add: Click to add a new Input Selector. An empty row appears in the list.
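The three operators can be read as three kinds of string test. The sketch below shows one plausible interpretation; the function name matches_internal_rule and the exact semantics (for example, whether Regex searches anywhere in the URL) are assumptions rather than documented behavior.

```python
import re


def matches_internal_rule(url: str, operator: str, value: str) -> bool:
    """Evaluate one Internal URLs rule against a URL (hypothetical semantics)."""
    if operator == "Contains":
        return value in url
    if operator == "Starts with":
        return url.startswith(value)
    if operator == "Regex":
        return re.search(value, url) is not None
    raise ValueError(f"Unknown operator: {operator}")


url = "http://foo.com/en/booking"
print(matches_internal_rule(url, "Contains", "/en/"))               # True
print(matches_internal_rule(url, "Starts with", "http://foo.com"))  # True
print(matches_internal_rule(url, "Regex", r"/booking$"))            # True
```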
Login Type section
This section provides instructions on how to fill out the Login Type section.
Click the drop-down arrow and select a login type. The choices are:
None: The site does not require user logins.
Form: The field expands. The expanded input fields specify the CSS selectors that identify the various form fields and buttons. The following input fields open:
Input selector: (Required field) Enter the CSS ID or class selector to search for.
Default Value: Click to hide or show the input text. Click in the dialog box to set a default value for the selector.
Delete: Click the trashcan icon to delete the row.
+ Add: Click to add a new Form. An empty row appears in the list.
Submit selector: (Required field) Click in the dialog box and type in a string (without spaces) to identify the submit button for the form.
Verify URL: (Required field) Click in the dialog box and type in a URL address that the user will redirect to if the form is valid.
Office 365 (BETA): This option accepts the credentials for an Office 365 login.
Email: Click in the dialog box and type in a valid Office 365 email address.
Password: Click in the dialog box and type in the Office 365 password for the email account.
Remember me: Turn the switch ON to remember user credentials.
Basic Auth:
Username: Click in the input field and type in a username field identifier.
Password: Click in the input field and type in a password field identifier.
Hosts: Click in the input field and type in the URL to the host website.
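For background, HTTP Basic authentication sends the username and password in an Authorization header as a Base64-encoded "username:password" string. The sketch below shows that standard encoding only. How the scanner stores or sends the values entered in these fields is not described in this article, and the function name and sample credentials are hypothetical.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build the HTTP Basic Authorization header for the given credentials."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Sample credentials for illustration only.
print(basic_auth_header("scanner", "s3cret"))
```

Because Base64 is an encoding and not encryption, Basic Auth credentials are only protected in transit when the host is served over HTTPS.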
Custom Multistep: This option covers login flows that require special integration. Contact the support team for assistance.
Verify URL: Click in the dialog box and type in a URL address for the user to redirect to if the login is valid.
Username: Click in the input field and type in a username field identifier.
Password: Click in the input field and type in a password field identifier.
JS: This is a field to enter JavaScript strings for the scan to identify.
Delete: Click the trashcan icon to delete the row.
+ Add: Click to add a new row. An empty row appears in the list.
Features section
This section provides instructions on how to complete the Features section.
Accessibility: Click the drop-down arrow and select the accessibility level that is desired or required for the website. Level AA of WCAG 2.1 is the legal requirement to be considered accessible. WCAG 2.2 is newly released and will soon be the standard, taking over from WCAG 2.1.
Important
When you change the accessibility level, the existing Accessibility Score scan information is erased and the new standard starts over from the beginning with the next scan. All of the history is lost. It is advised to activate a new level only when you are ready to make the updates needed, or if it becomes a legal requirement for your location/organization. If you change the scan to Exclude checks that require manual review, you might not see other potential accessibility violations on your website.
Check with your legal department to find out which level is needed for your industry or location to be considered accessible.
Source code excludes: Click the drop-down arrow and select domains to exclude from the scan.
PDF accessibility: Click the drop-down arrow and select the level of accessibility for the PDF pages included in the scan. The choices are WCAG 2.0 and WCAG 2.1.
Domain Groups section
Set up groups to assign specific users to a subset of pages on one or more scanned domains. Created groups appear under the domain name.
Users section
The Users section shows a list of users with access to the interface. This section provides instructions on how to add permissions to users in the Users section.
The list is presented in table format.
The table column headers are:
User: The username of the user.
Visible: Tick the box to make the control panel visible to the user.
Send report: Tick the box to send reports to the user.
Admin: Tick the box to give the user admin permissions. Admins can add, edit, and delete.
Edit a domain
This section provides information for admin users on how to edit a domain.
Click Admin Settings (gear icon) on the top menu bar of any page in the platform.
Click Action on the same row as the domain to edit. A drop-down list opens.
Select Edit Domain from the drop-down list.
The Edit Domain page opens.
Edit as needed. Refer to the previous section, Add a domain, for instructions on how to fill in the fields.
Important
When you change the accessibility level, the existing Accessibility Score scan information is erased and the new standard starts over from the beginning with the next scan. All of the history is lost. It is advised to activate a new level only when you are ready to make the updates needed, or if it becomes a legal requirement for your location or organization. If you change the scan to Exclude checks that require manual review, you might not see other potential accessibility violations on your website.
Save the changes.
Remove a domain
This section provides information for admin users on how to remove a domain.
Click Admin Settings (gear icon) on the top menu bar of any page in the platform.
The Domain Settings page opens.
Click Action on the same row as the domain to remove. A drop-down list opens.
Select Remove Domain.
A dialog box asks for confirmation. "You are about to delete the domain (domain name) please type Delete to confirm this action."
Type the word "Delete" in the text box and then click OK to remove the domain. Click Cancel to close the dialog without removing the domain.
Important
When you delete a domain, all of the related domain groups are also deleted.
The dialog closes and the domain and all related domain groups are gone from the list.
The recommended solution is to disable the case-sensitive feature and use only regex.
Render pages and execute JS while crawling: Toggle the switch ON to render pages and execute JavaScript during the scan.
When this option is turned ON, the scanner renders pages and executes JavaScript during the scan. This allows the scan to take into account any changes caused by JavaScript.
Note
If the Accessibility feature is enabled, changes caused by JavaScript are automatically accounted for by default.
Mark 403 as broken link: Toggle the switch ON to mark links to pages that return a 403 (Forbidden) error as broken links.
Ignore canonical URLs: Toggle the switch ON to instruct the scan to ignore canonical URLs.
More info: A canonical URL is the URL of the page that Google determines is most representative, taken from a set of duplicate pages on your website.
Use language attribute: Turn the switch OFF or ON. When it is ON, the scan does not try to auto-detect the page language. This means that the spell check is only done for the language specified in the lang attribute.
The Use Lang Attribute feature allows the scanner to use the correct spell check for HTML pages that have multiple languages.
Note
The primary language of the HTML page must be correctly defined as a lang attribute on the <html> tag. Additionally, each subsection of the page that is in any other language must be wrapped in a tag with a proper lang attribute value. See below for an example:
<html lang="en">
<head>
<title>Hello World in other languages</title>
</head>
<body>
<h1>Hello World in other languages</h1>
<p>
Spanish: <span lang="es">Hola Mundo</span><br>
Danish: <span lang="da">Hej verden</span><br>
French: <span lang="fr">Bonjour le monde</span>
</p>
</body>
</html>
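To illustrate why the lang attributes matter, the following sketch groups the text of a page by the nearest enclosing lang attribute, so that each piece of text could be handed to the matching dictionary. It is a simplified illustration built on the Python standard library. The class and function names are hypothetical, and the platform's scanner may work differently.

```python
from html.parser import HTMLParser

# Void elements never have an end tag, so they must not enter the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class LangSegmenter(HTMLParser):
    """Groups page text by the nearest enclosing lang attribute."""

    def __init__(self):
        super().__init__()
        self.lang_stack = [None]  # language in effect for each open element
        self.segments = []        # (lang, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # An element without its own lang inherits the language of its parent.
        lang = dict(attrs).get("lang") or self.lang_stack[-1]
        self.lang_stack.append(lang)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and len(self.lang_stack) > 1:
            self.lang_stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.segments.append((self.lang_stack[-1], text))


def segments_by_lang(html_text):
    parser = LangSegmenter()
    parser.feed(html_text)
    return parser.segments


sample = """<html lang="en"><body>
<p>Spanish: <span lang="es">Hola Mundo</span><br>
Danish: <span lang="da">Hej verden</span></p>
</body></html>"""

for lang, text in segments_by_lang(sample):
    print(lang, text)
```

In this sketch, text outside the span elements is attributed to the language set on the html tag, which is why the primary language must be declared there.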
Delete
+ Add: Click to add a new Input Selector. An empty row appears in the list.
Custom Multistep
Readability: Click the drop-down arrow and select a readability standard to determine the level of difficulty of the website text, to ensure that the correct audience is targeted. The choices are Flesch Kincaid and LIX.
Note
The two types of readability tests that we use are Flesch Kincaid for English and LIX for foreign languages.
The Flesch–Kincaid readability test for English text measures word length, number of syllables per word, and sentence length. The formula for the Flesch readability score (FRES) test is as follows: FRES = 206.835 - 1.015 x (total words / total sentences) - 84.6 x (total syllables / total words)
The LIX method is used for foreign languages and the readability formula is as follows:
LIX = A/B + (C x 100)/A
where:
A = Number of words
B = Number of periods (defined by period, colon, or capital first letter)
C = Number of long words (more than 6 letters)
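The LIX formula can be worked through in a few lines of code. The sketch below is a simplified illustration, not the platform's implementation: the function name is hypothetical, words are taken to be runs of letters, and periods are counted only from period and colon characters (the "capital first letter" rule is left out as a simplifying assumption).

```python
import re


def lix(text):
    """Compute LIX = A/B + (C x 100)/A for a piece of text."""
    words = re.findall(r"[^\W\d_]+", text)      # runs of letters
    a = len(words)                              # A: number of words
    if a == 0:
        return 0.0
    b = len(re.findall(r"[.:]+", text)) or 1    # B: number of periods, at least 1
    c = sum(1 for w in words if len(w) > 6)     # C: words longer than 6 letters
    return a / b + (c * 100) / a


# 6 words, 2 periods, 3 long words: 6/2 + (3 x 100)/6 = 53.0
print(lix("Short words here. Another sentence follows."))
```

Because periods are counted from punctuation alone, text that contains decimals or abbreviations would be over-counted by this sketch.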
Minimum words: Enter the minimum number of words that a page should contain for the readability score to begin calculating. Pages that have fewer words than indicated are not included in the readability score.
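The effect of the Minimum words setting can be sketched as a filter that runs before scoring. The function names and the whitespace-based word count are assumptions made for illustration only.

```python
def count_words(text):
    """Count whitespace-separated words (a simplifying assumption)."""
    return len(text.split())


def pages_to_score(pages, minimum_words):
    """Return the URLs of pages that have enough words to be scored.

    pages maps a URL to the text of that page.
    """
    return [url for url, text in pages.items()
            if count_words(text) >= minimum_words]


pages = {
    "/contact": "Call us today",
    "/about": "We build tools that help teams publish accessible content",
}

# Only /about reaches the five-word minimum.
print(pages_to_score(pages, minimum_words=5))
```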
Statistics: Toggle ON to turn on this feature and begin to track website visitor statistics.
Data Privacy: Toggle the switch OFF or ON. Click the link to navigate to the Domain Data Privacy Settings page.
The Data Privacy module identifies potential data privacy issues. It can serve as a form of documentation that allows website owners to show their efforts to be compliant with local regulations.
Scan documents: Toggle the switch ON to include documents in the scan.
The scan covers all pages of a website. When Scan documents is turned ON, documents on the website are also scanned. The document scan identifies the same types of issues as a domain scan.
PageAssist: Toggle the switch ON to activate. Click the link to navigate to the PageAssist Settings page.
PageAssist allows website visitors to make custom visual adjustments to suit their unique requirements.
Consent Manager: Toggle the switch ON to activate.
The Consent Manager module includes everything that you need to classify and set up user cookie acceptance and website cookie banners. This module also includes automatic cookie scan detection and cookie maintenance.