Minerazzi URL Scoring Tool (MUST)
- Minerazzi URL Scoring Tool (MUST) is a powerful redirection checker that reports the actual status code, URL, and IP of a web resource, even in the presence of bogus responses due to 301 and 302 redirections.
- To use the tool, submit up to 500 URLs. Enter one URL per line, ending each line by pressing the Enter key so that each line is recognized as an individual entry.
- URLs can be submitted with or without a scheme (http:// or https://). However, depending on the DNS configuration, including or excluding the www. alias can sometimes produce different results.
- You may want to run one web browser instance of the tool per machine to avoid unexpected results.
- If testing a large number of URLs, you may want to do other tasks while the tool is working. Keep in mind that the response time of a remote host is often unknown.
- You may terminate a run at any time by double-clicking the tool reset button and then waiting a few seconds for the current server request to end.
- The tool processes up to 500 URLs at once. Larger input sets will be resized to conform to this limit.
- If the browser stops working during a test, place the cursor in the browser address field and press the Enter key. Do not press the reset button, as this will interrupt the session.
- The tool computes the initial and final state of a web resource in terms of its URL, IP, and HTTP status code.
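The input rules and the notion of initial and final state described above can be sketched as follows. This is only an illustration, not the tool's actual code: `prepare_urls`, `trace`, `State`, and the injected `fetch` callable are hypothetical names, dropping entries beyond the 500-URL limit is an assumption about what "resized" means, and the IP component of each state is omitted for brevity.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple
from urllib.parse import urljoin

MAX_URLS = 500  # limit stated in the text above


class State(NamedTuple):
    url: str
    status: Optional[int]  # None when no response code was obtained


# Hypothetical fetcher: takes a URL, returns (status code, Location header or None).
Fetch = Callable[[str], Tuple[Optional[int], Optional[str]]]


def prepare_urls(raw: str, default_scheme: str = "http") -> List[str]:
    """Read one URL per line, skip blank lines, add a scheme when missing."""
    urls = []
    for line in raw.splitlines():
        entry = line.strip()
        if not entry:
            continue
        if "://" not in entry:
            entry = f"{default_scheme}://{entry}"
        urls.append(entry)
    return urls[:MAX_URLS]  # assumption: entries beyond the limit are dropped


def trace(url: str, fetch: Fetch, max_hops: int = 10) -> Tuple[State, State]:
    """Follow 3xx Location headers and return the (initial, final) states."""
    status, location = fetch(url)
    initial = State(url, status)
    current = initial
    seen = {url}
    for _ in range(max_hops):
        is_redirect = current.status is not None and 300 <= current.status < 400
        if not is_redirect or not location:
            break
        next_url = urljoin(current.url, location)  # handles relative Location values
        if next_url in seen:  # redirect loop: stop and report the last state
            break
        seen.add(next_url)
        status, location = fetch(next_url)
        current = State(next_url, status)
    return initial, current
```

Passing the fetcher in as a parameter keeps the sketch independent of any particular HTTP library; a real fetcher would issue a request with automatic redirects disabled and return the status code and `Location` header.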
- The current version of the tool applies two different tests, T1 and T2, to all final states in order to classify URLs as GOOD (accessible) or BAD (inaccessible, broken). In most cases,
- if T1 or T2 or both return a 200/300-level response code, the URL is GOOD. If only one test returns such a code, the other test is most likely returning a bogus response. Such responses are often, though not always, returned to deceive web crawlers and automated request tools.
- if both T1 and T2 return a 400/500-level response code, or no code, the URL is classified as BAD.
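The GOOD/BAD rule above can be written as a small function. This is a minimal sketch: what T1 and T2 actually do is not described here, so the function only takes their resulting status codes (or `None` for no code). The names `is_accessible` and `classify` are hypothetical, and treating 100-level codes as BAD is an assumption, since the text does not mention them.

```python
from typing import Optional


def is_accessible(status: Optional[int]) -> bool:
    """True for 200/300-level codes; False for 400/500-level codes or no code."""
    return status is not None and 200 <= status < 400


def classify(t1_status: Optional[int], t2_status: Optional[int]) -> str:
    """GOOD if either test saw a 200/300-level code, BAD otherwise."""
    if is_accessible(t1_status) or is_accessible(t2_status):
        return "GOOD"
    return "BAD"
```

For example, `classify(403, 200)` returns `"GOOD"` (the 403 is presumed bogus), while `classify(404, None)` returns `"BAD"`.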
- URLs from Old Glory Day
Use this tool to find out what happened to those Old Glory URLs belonging to search engines from the 90s. Are they still active or redirecting? If so, to where? Our tool can tell you.
- Useless URLs
Use this tool to identify useless URLs before they reach your database. These are URLs that point to bogus, nonexistent, misconfigured, blocked, inactive, forbidden, or registration-required pages. For web services that accept URL submissions, these serve no purpose other than to occupy database space.
- Same-IP URLs
Use this tool to check if a set of URLs share the same IP. This might suggest the use of a shared hosting account or the hosting of different domains on the same server.
- Parked and Bogus Domains
Avoiding the indexing of domains that redirect to unwanted, parked, or bogus domains is not an easy task, and there is no bullet-proof solution. The following is a strategy that we have used with some degree of success.
- We first test a set of domain names with our MUST tool by submitting them like this
randomstring.domain.tld.
where randomstring is a subdomain consisting of an arbitrary random sequence of characters and tld is the corresponding top-level domain extension; e.g., com, net, org, and so forth. Notice that we append a trailing period.
- If the result is a GOOD URL, it most likely belongs to a parked domain undergoing redirection.
- So we retest those that returned BAD URLs, now without the trailing period, like this
randomstring.domain.tld
- Next, those returning BAD URLs are retested one more time, with the randomstring removed, like this
domain.tld
Those now returning GOOD URLs are taken at face value.
- For the grand finale, we collect the IPs of all ignored domains for future reference. We do this because we know that parked domains from a domain broker often resolve to a common set of IPs. The collected IPs can then be used as part of a spam filter.
- Our experience is that the above strategy works fairly well, but is not bullet-proof. For instance, domain brokers can change their IPs. Valid domains can be set to redirect to themselves even if they include bogus subdomains, effectively acting as parked. In addition, valid domains can be set to redirect to parked domains through meta-refresh tags or scripts; however, these cases can be detected by disabling meta refresh and scripts in the browser.
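The three-pass screening described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure's actual implementation: `probe_hosts`, `screen`, and `collect_ignored_ips` are hypothetical names, `check` stands in for a MUST run that returns "GOOD" or "BAD" for one host, and `resolve` stands in for an IP lookup. Treating a GOOD result in the second pass as "ignored" and a BAD result in the third pass as "rejected" are our reading of the steps, which do not spell out those two outcomes.

```python
import secrets
from typing import Callable, Iterable, List, Optional, Set


def probe_hosts(domain: str, label: str) -> List[str]:
    """Build the three probes: random label with trailing period, without it, bare domain."""
    return [f"{label}.{domain}.", f"{label}.{domain}", domain]


def screen(domain: str, check: Callable[[str], str], label: Optional[str] = None) -> str:
    """Return 'ignored' (likely parked), 'accepted', or 'rejected' for one domain."""
    label = label or secrets.token_hex(8)  # arbitrary random subdomain label
    with_dot, without_dot, bare = probe_hosts(domain, label)
    # Passes 1 and 2: a random subdomain that tests GOOD suggests a parked domain.
    if check(with_dot) == "GOOD" or check(without_dot) == "GOOD":
        return "ignored"
    # Pass 3: the bare domain is taken at face value.
    return "accepted" if check(bare) == "GOOD" else "rejected"


def collect_ignored_ips(
    domains: Iterable[str],
    check: Callable[[str], str],
    resolve: Callable[[str], str],
) -> Set[str]:
    """Gather the IPs of ignored domains, e.g. as input for a spam filter."""
    return {resolve(d) for d in domains if screen(d, check) == "ignored"}
```

Because `or` short-circuits, the second probe is only sent when the first one returns BAD, which mirrors the order of the retests.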
- A Word on CNAME Configurations
Querying a domain name with a leading www. can lead nowhere if, in the DNS, the www alias does not point via a CNAME resource record to the canonical URL. This can also occur if the www. is required but omitted in the query. Ideally, configurations should be set so that a domain name query maps to the intended canonical URL regardless of whether it is made with or without www.
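One way to spot such a misconfiguration is to test both forms of a host name and compare where they end up. The sketch below assumes a caller-supplied `final_url` function (hypothetical) that returns the final URL reached for a host, or `None` when the host leads nowhere; `www_variants` and `compare_variants` are likewise illustrative names.

```python
from typing import Callable, Dict, Optional, Tuple


def www_variants(host: str) -> Tuple[str, str]:
    """Return the (no-www, yes-www) forms of a host name."""
    host = host.strip().rstrip(".").lower()
    bare = host[4:] if host.startswith("www.") else host
    return bare, "www." + bare


def compare_variants(
    host: str, final_url: Callable[[str], Optional[str]]
) -> Dict[str, object]:
    """Check whether both forms reach the same final URL."""
    bare, www = www_variants(host)
    bare_final, www_final = final_url(bare), final_url(www)
    return {
        "no_www": bare_final,
        "yes_www": www_final,
        # Consistent only when both forms resolve and agree on the destination.
        "consistent": bare_final is not None and bare_final == www_final,
    }
```

A result with `"consistent": False` flags a host that is worth submitting to the tool in both forms.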
- Webmasters, developers, and programmers.
- All the data generated by Minerazzi is provided for information purposes only. By using this site, you agree to use our data only for lawful purposes. Unless stated otherwise, the compilation, repackaging, and for/non-for profit dissemination or other use of our data is expressly prohibited without our prior written consent. We reserve the right to modify these terms at any time. By using this site, you agree to abide by these terms.
Feedback
Contact us for any suggestion or question regarding this tool.