What a URL is and how to work with it. A URL with or without a trailing slash: which is correct? What is a URI?
Arguments over how to write a URL correctly, with a trailing slash or without one, have always existed and always will. The reasoning is varied and often contradictory. The supposed penalties for writing a Uniform Resource Locator (URL) the wrong way are of two kinds. From the search engines' side, these are supposedly sanctions for duplicate pages. From the performance side, it is supposedly an extra redirect, generated automatically by the server, to the correctly written address.
However, if you analyze the Internet standards, in particular the document "RFC 1738 - Uniform Resource Locators (URL)", you have to admit that both ways of writing a web resource's address are formally correct, and that sanctions for using one or the other are nothing more than a search-engine quirk or a tall tale told by pseudo-SEO specialists.
From the standpoint of brevity, the variant without a trailing slash is the more correct one, regardless of whether your link addresses a "file" or a "folder" on the server; indirect evidence of this is given below. But nowhere does the document state that the other variant is incorrect or refers to a different resource.
I will not burden you with a multi-page translation of the RFC: first, the question here is only about the end of the URL, and second, this publication is addressed to ordinary users of site engines, most of whom are not interested in every detail and expect brief explanations and evidence on the substance. Accordingly, I will quote excerpts from the document as evidence and explain them. Those who are not interested can skip straight to the end of the article.
General URL syntax
First of all, I will draw attention to an excerpt from section 2, General URL Syntax. In each case I will quote the relevant fragment of the original text.
URLs are used to "locate" resources, by providing an abstract identification of the resource location.
In other words, the URL itself is a pure abstraction. The fact that it may look like a file or folder name does not mean that it physically points to exactly that file, and not some other one, in the server's file space. The document says so directly further on.
Note. In general, with regard to HTTP links it is fundamentally wrong to say that, for example,
- http://domain.com/path/subpath/filename.txt supposedly points to a file;
- http://domain.com/path/subpath/ supposedly points to a folder;
- http://domain.com/path supposedly points to a folder incorrectly.
We are simply used to speaking this way because it is convenient to associate links with files on the site. In fact, all of these links point to resources and say nothing about the type of resource. What stands behind each resource, that is, which real file or folder and which type of content will be served for a given link, is determined by the server configuration.
It is important to understand that links have no such notions as "file", "folder", "subfolder", "text", "picture", "HTML", "script", "style sheet" and so on. Neither the presence nor the absence of a trailing slash means anything at all until the link has been processed inside the server, which then decides where the link actually points and what content of what type stands behind it. That decision belongs entirely to the server's internal architecture.
Hierarchical schemes
Next, an excerpt from section 2.3, Hierarchical schemes and relative links.
Some URL schemes (such as the ftp, http, and file schemes) contain names that can be considered hierarchical; the components of the hierarchy are separated by "/".
That is, the document states that in certain address schemes the contents of a resource locator may be regarded as hierarchical, but it does not stipulate that this hierarchy is equivalent to any particular one, say a file hierarchy.
Common Internet scheme syntax
Next, an excerpt from section 3.1, Common Internet Scheme Syntax.
//<user>:<password>@<host>:<port>/<url-path>
Note. This, by the way, answers a question that follows from the one we are considering. People often argue about how to link to a domain (host): with a trailing slash or without?
Which is correct, http://domain.com/ or http://domain.com?
Both are. The first slash after the host name simply separates the path from the host name. The same section of the document puts it this way:
url-path: The rest of the locator consists of data specific to the scheme, and is known as the "url-path". It supplies the details of how the specified resource can be accessed. Note that the "/" between the host (or port) and the url-path is NOT part of the url-path.
Not a word says that you are obliged to put this closing character, or to omit it, when the url-path is an empty string (or, as many of us would say, when the URL refers to the site root). No one has the right to apply sanctions "for two duplicates of the home page", because according to the specification both forms of the URL refer to the same resource.
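The equivalence of the two root forms can be illustrated with a minimal Python sketch. The function name normalize_root is an assumption made for this illustration; it relies only on the standard urllib.parse module and treats an empty path as "/", which is one possible normalization and not something the RFC prescribes.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_root(url: str) -> str:
    """Return the URL with an empty path replaced by "/" (hypothetical helper)."""
    parts = urlsplit(url)
    # Only an empty path is changed; any non-empty path is left untouched.
    path = parts.path or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


# Both spellings of the site root come out identical.
print(normalize_root("http://domain.com"))   # http://domain.com/
print(normalize_root("http://domain.com/"))  # http://domain.com/
```

A crawler or a statistics script that applies such a step before comparing addresses will never see the two root links as different pages.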
Another excerpt from the same section:
The url-path syntax depends on the scheme being used, as does the manner in which it is interpreted.
This is further confirmation that each locator scheme has its own notion of "hierarchy" and its own way of interpreting it.
Hierarchy
For some file systems, the "/" used to denote the hierarchical structure of the URL corresponds to the delimiter used to construct a file name hierarchy, and thus, the filename will look similar to the URL path. This does NOT mean that the URL is a Unix filename.
Although this paragraph belongs to the FTP scheme, its statement extends to other schemes as well (HTTP, Gopher, Prospero and so on). Only in the file scheme does the slash logically denote the same thing as in file names, for example file://Server_or_Device/Path/Subpath/FileName.txt.
HTTP
An HTTP URL takes the form: http://<host>:<port>/<path>?<searchpart>
Note. The document also states here that a link may be written without a trailing slash. That statement concerns the case where the path is empty, that is, where the link points to the root of the host.
Formal notation
And finally, an excerpt from section 5, BNF for specific URL schemes.
Here square brackets enclose optional parts. An asterisk in front of a bracket means zero or more repetitions of the bracketed fragment. The vertical bar should be read as "or".
hostport = host [ ":" port ]
...
httpurl = "http://" hostport [ "/" hpath [ "?" search ]]
hpath = hsegment *[ "/" hsegment ]
hsegment = *[ uchar | ";" | ":" | "@" | "&" | "=" ]
search = *[ uchar | ";" | ":" | "@" | "&" | "=" ]
...
lowalpha = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
hialpha = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"
alpha = lowalpha | hialpha
digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
safe = "$" | "-" | "_" | "." | "+"
extra = "!" | "*" | "'" | "(" | ")" | ","
hex = digit | "A" | "B" | "C" | "D" | "E" | "F" | "a" | "b" | "c" | "d" | "e" | "f"
escape = "%" hex hex
unreserved = alpha | digit | safe | extra
uchar = unreserved | escape
Notice how the rules form the hpath element, the path of the link. The path elements, hsegment (segments), are separated by slashes. This hints at an important idea: the slash divides the path into hierarchical parts and always sits between them. In principle, the last hsegment may be an empty string (this follows from its definition), in which case a closing slash inevitably appears at the end of the URL.
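The hpath rule can be tried out with a small Python sketch. The helper name hsegments is an assumption for illustration; it simply splits on "/", which mirrors the rule hpath = hsegment *[ "/" hsegment ] without validating the characters of each segment.

```python
def hsegments(hpath: str) -> list[str]:
    """Split an hpath (the text after the slash that follows the host) into segments."""
    # Every "/" separates two segments, so a trailing slash
    # yields one more segment, which is empty.
    return hpath.split("/")


print(hsegments("level1/level2"))     # ['level1', 'level2']
print(hsegments("level1/level2/"))    # ['level1', 'level2', '']
print(hsegments("level1////levelx"))  # ['level1', '', '', '', 'levelx']
```

The empty strings in the output are exactly the unnamed segments: the grammar allows them, but they carry no name of their own.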
Conclusion
Dividing the path into segments with the slash character implies that these segments have non-empty names. Accordingly, a link with a trailing slash looks illogical (though it is not prohibited), in the sense that it seems to point to some final path segment yet does not name that segment at all. Just as illogical (and just as permitted) is the link http://domain.com/level1////levelx, which leaves the intermediate path segments unnamed, if the path is viewed as a hierarchical structure rather than as a set of parameters.
Figuratively speaking, the meaning of the two forms of link can be explained as follows:
- one addresses the default starting point of the second level of the hierarchy;
- the other addresses an unspecified point inside the second level of the hierarchy; that is, the task is handed to the server: "we are addressing the second level of the hierarchy, and you decide for yourself which point at this level you consider the default starting point."
From all of the above it follows that, just as the links
- http://domain.com
- http://domain.com/
address the visitor to the root of the site, so, for example, the links
- http://domain.com/level1/level2
- http://domain.com/level1/level2/
address the visitor to the second level of the resource hierarchy. The fact that a particular server may interpret the trailing slash in its own way and internally redirect to a default starting point, say the file index.html, is a special case of a specific configuration. Moreover, when human-readable URLs are implemented, the redirect rules written for the mod_rewrite server module define their own notion (specific to the engine in use) of the URL's hierarchical structure, in which path elements may be equated with query parameters and have nothing in common with the site's file structure (a classic example: in http://domain.com/ru/path the element ru is the current-language parameter, not a folder on the site).
We emphasize that this is the server's internal knowledge, determined by its configuration and by the engine installed on the site. An external service, such as a search engine, cannot guess and has no way of knowing whether links with and without a trailing slash differ, and how, unless the site's server has been deliberately configured to serve different content for such links.
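How an engine might treat path elements as parameters can be sketched in Python. Everything here is hypothetical: the route function, the LANGUAGES set and the default language are assumptions for illustration and do not describe any particular engine.

```python
# Hypothetical set of language codes that the imaginary engine understands.
LANGUAGES = {"ru", "en"}


def route(path: str) -> dict:
    """Interpret a URL path the way a hypothetical site engine might."""
    # Empty segments (including the one produced by a trailing slash) are ignored.
    segments = [s for s in path.split("/") if s]
    lang = "en"  # assumed default language
    if segments and segments[0] in LANGUAGES:
        # The first element is a parameter, not a folder on the site.
        lang = segments.pop(0)
    return {"lang": lang, "page": "/".join(segments)}


print(route("/ru/path"))   # {'lang': 'ru', 'page': 'path'}
print(route("/ru/path/"))  # {'lang': 'ru', 'page': 'path'}
```

With this interpretation the trailing slash changes nothing, and only the server knows that; nothing in the link itself reveals it to an outside observer.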
For your information
At the implementation level, the question of trailing slashes is of no fundamental importance, as many well-known portals confirm. On some, all links end with a slash; on others, none do. The main thing is that the content behind the two forms of a link does not differ, and, for Yandex, that you set up a 301 redirect from the links you do not use (say, those with trailing slashes) to the ones you do. The point is that, according to unconfirmed statements from Yandex support, this search engine may supposedly make mistakes and fail to "glue together" (record in its index as one) slash and no-slash addresses, or may glue them only after some delay.
Here is an example of such a redirect in the root .htaccess file:
RewriteEngine On
# If the incoming URL ends with one or more slashes,
# issue a 301 redirect to the page without the slash
RewriteCond %{REQUEST_URI} ^/.+/$
RewriteRule ^(.*?)/+$ http://%{HTTP_HOST}/$1 [R=301,L]
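The decision made by this rewrite rule can be mirrored in a few lines of Python, which is handy for checking which addresses would be redirected. The function name redirect_target is an assumption; the sketch reproduces only the path logic of the rule, not Apache's behavior as a whole.

```python
import re


def redirect_target(request_uri: str):
    """Return the slash-less path to redirect to, or None if no redirect is needed."""
    # Same condition as ^/.+/$ : a non-root path that ends with a slash.
    if not re.fullmatch(r"/.+/", request_uri):
        return None
    # Drop all trailing slashes; fall back to the root if nothing is left.
    return request_uri.rstrip("/") or "/"


print(redirect_target("/level1/level2/"))  # /level1/level2
print(redirect_target("/level1/level2"))   # None
print(redirect_target("/"))                # None
```

The site root is deliberately left alone: its slash separates the host from an empty path and cannot be removed from the request.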
For Google (again, according to information not verified by experiment), these redirects do not matter, since it is apparently able to glue such addresses together correctly without redirects.
Remember. There are quite a few people who consider themselves SEO specialists, but not all of them are. Moreover, SEO is a topic people often speculate on without proper knowledge or grounds, simply counting on the fact that you are no expert in this field and will easily swallow any nonsense. When you are told that some page of yours has "dropped out of the index", follow Yandex's very good recommendation: you can learn about indexing errors, if there are any, in the Yandex.Webmaster service. There you can always see the list of your pages that are in the search and the list of pages excluded from it for some reason. Google has a similar service. Trust this information rather than the opinion of pseudo-specialists who heard something somewhere with half an ear and on that basis recommend what seems to them the only right thing to do.
Here is a very interesting publication of little-known SEO facts, published in April 2017. It is a large study with many screenshots, undertaken to check the validity of several popular beliefs about search promotion and to convey the results to the ordinary site owner through clear examples. The same study also demonstrates a number of features of organic results in Google and Yandex search, some obvious and ordinary, others rather inconspicuous but still surprising.
Here is another link. Although it has almost nothing to do with SEO, it will still appeal to SEO specialists who are currently looking for extra orders. Judging by their commercial offer, its authors have found a curious way to use a website. Private businesses are offered an online advertising billboard built on a special theme: the site, or more precisely its first screen, looks like a banner stretched across an outdoor advertising billboard. Turn the smartphone and the banner becomes vertical and fills the whole screen; turn it back and it becomes horizontal and again fills the whole screen. Below the first screen there is a block of text that users usually do not scroll to, but that the search engine sees perfectly well. The shrewdest regional business owners buy these inexpensive online billboards as a cost-effective alternative to contextual advertising and the display networks of Yandex and Google. And to get the most out of the local search index, they are ready to pay up front for a batch of SEO texts to promote their billboard. According to rumors, orders of around 30 thousand rubles come up, and since the authors are looking for SEO specialists as partners, it is possible to build a partnership and make useful contacts.
I always wanted to sort this out, but it mattered so little that there was always a reason not to :)
Have you ever wondered what a URL actually is?
I kept running into this but never got around to working out the difference between the terms URI, URL and URN. Then a post on the subject suddenly turned up (unfortunately, it had already come out back in the summer), and I decided to read it myself and tell others. As mentioned above, it will not change anything, but I like to dig into such things now and then, so here is a readable retelling:
Have you ever paid attention to the address bar of your browser? What is in it: a URI, a URL or a URN? Many of us make no distinction between URI, URL and URN, some have never even heard the terms URI and URN, and everyone simply uses the term URL. Let's try to figure it out together.
Expanding the abbreviations
URI - Uniform Resource Identifier
URL - Uniform Resource Locator
URN - Uniform Resource Name
Attention: the truth here lies in the details, but so far nothing is clear, it is all a jumble. Let's move on.
Definition
URI: the name and address of a resource on the network. URIs are generally divided into URLs and URNs, so both URLs and URNs are URIs.
URL: the address of a resource on the web. A URL defines the location of the resource and the way to access it.
URN: the name of a resource on the web. The point of a URN is that it defines only the name of a specific item, which may exist in many specific places.
There is nothing better than a specific example
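URI = http://Site/2009/09/uri-URL-RN.html
URL = http://Site
URN = /2009/09/uri-URL-RN.html
Strictly speaking, in the sense of RFC 2141 a URN is a URI in the urn scheme, for example urn:isbn:0451450523, so the split above is a simplified illustration.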
Let's summarize
A URI is the abstract concept of an identifier, while the URL and the URN are its concrete realizations: an address and a name.
I hope everything is clear. Be literate!
Everyone perceives things in their own way, so argue, and read the discussion in the comments to the article; there is a lot of interesting material there.
You can get lost not only in a forest but also on the Internet, and a path or address leading to a resource may turn out to be wrong. Don't know what a URL is? Then, before traveling any further through virtual space, let's sort out the system of web addresses.
What is a URL
The URL is the generally accepted standard for writing an address that indicates the location of a resource on the Internet. Its English name, Uniform Resource Locator, translates as a uniform pointer to a resource. You may also come across an earlier expansion of the abbreviation: Universal Resource Locator. The two expansions complement each other rather than contradict each other.
The general format of a URL looks like this:
<scheme>://<login>:<password>@<host>:<port>/<URL-path>?<parameters>#<anchor>
scheme - most often denotes the protocol.
login - the user name used for authorization on the resource.
password - the user's password for authorization.
host - the domain name of the host.
port - the host port used for the connection.
URL-path - the path at which the requested resource is located on the server.
parameters and anchor - the values of variables and an identifier within a specific resource.
Variables are passed in the query string when the GET method is used.
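The parts listed above can be extracted with Python's standard urllib.parse module. This is a minimal sketch: the address itself, including host.example, the credentials and the port, is made up for illustration.

```python
from urllib.parse import urlsplit

# A made-up address that contains every part of the general format.
url = "http://login:password@host.example:8080/path/page?num=10&type=new#anchor"
parts = urlsplit(url)

print(parts.scheme)    # http
print(parts.username)  # login
print(parts.password)  # password
print(parts.hostname)  # host.example
print(parts.port)      # 8080
print(parts.path)      # /path/page
print(parts.query)     # num=10&type=new
print(parts.fragment)  # anchor
```

Any part that is absent from an address comes back as an empty string or None, so the same code works for shorter URLs.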
Let's look at the URL format of a requested page using practical examples. On the client side, the URL is displayed in the browser's address bar.
The most common variants are:
- http://ru.wikipedia.org/wiki/Nagal - HTTP (Hypertext Transfer Protocol) is used to send the request;
- https://ru.wikipedia.org/wiki/Nagal_Strica - HTTPS is used as the transfer method; it is a secure form of the HTTP protocol that uses encryption (SSL or TLS);
- ftp://wikipedia.org/wiki/file.txt - FTP, the File Transfer Protocol;
- http://mail.ru/script.php?num=10&type=new&v=text - variables passed in the query string using the GET method.
Any URL is, first of all, a string of characters. It may include:
- letters;
- Arabic numerals (0-9);
- reserved characters ("+", "=", "!" and others);
- special characters, which we will look at in more detail.
Using special characters in a URL
Of course, nothing truly "special" is used in a URL. Still, there are several such characters:
- ? - separates the query string with the passed parameters from the rest of the address;
- & - separates the passed parameters from one another;
- = - separates a variable's name from its value within a parameter;
- : - separates the protocol from the rest of the URL;
- # - used in the local part of the address; lets you refer to a particular part of the requested page;
- @ - separates the user's credentials from the host; it is also used in addresses of the mailto scheme.
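The role of "?", "&" and "=" is easy to see with Python's standard urllib.parse module. This is a minimal sketch: the query string is taken from the mail.ru example earlier in the text, while the dictionary passed to urlencode is made up for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

url = "http://mail.ru/script.php?num=10&type=new&v=text"

# Everything after "?" is the query string; "&" and "=" structure it.
query = urlsplit(url).query
params = parse_qs(query)
print(params)  # {'num': ['10'], 'type': ['new'], 'v': ['text']}

# When a value itself contains "&" or "=", it has to be percent-encoded,
# otherwise it would be mistaken for a separator.
encoded = urlencode({"q": "a&b=c", "page": 2})
print(encoded)  # q=a%26b%3Dc&page=2
```

Percent-encoding is what lets the same characters act as separators in one place and as ordinary data in another.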
But all of this is only theory. So, before finding out the rest, let's consider a small practical example.
Illustrative example
For clarity, take a simple registration form like this one:
Here is its code: