CustomAnalyzer Class
Allows you to take control over the process of converting text into indexable/searchable tokens. It's a user-defined configuration consisting of a single predefined tokenizer and one or more filters. The tokenizer is responsible for breaking text into tokens, and the filters for modifying tokens emitted by the tokenizer.
Constructor
CustomAnalyzer(*args: Any, **kwargs: Any)
Variables
| Name | Description |
|---|---|
| name | The name of the analyzer. It must only contain letters, digits, spaces, dashes, or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters. Required. |
| tokenizer_name | The name of the tokenizer used to divide continuous text into a sequence of tokens, such as breaking a sentence into words. Required. Known values are: "classic", "edgeNGram", "keyword_v2", "letter", "lowercase", "microsoft_language_tokenizer", "microsoft_language_stemming_tokenizer", "nGram", "path_hierarchy_v2", "pattern", "standard_v2", "uax_url_email", and "whitespace". |
| token_filters | A list of token filters used to filter out or modify the tokens generated by the tokenizer. For example, you can specify a lowercase filter that converts all characters to lowercase. The filters are run in the order in which they are listed. |
| char_filters | A list of character filters used to prepare input text before it is processed by the tokenizer. For instance, they can replace certain characters or symbols. The filters are run in the order in which they are listed. |
| odata_type | A URI fragment specifying the type of analyzer. Required. Default value is "#Microsoft.Azure.Search.CustomAnalyzer". |
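For instance, a custom analyzer that splits text on whitespace and then normalizes the tokens could be declared as below. This is a minimal sketch assuming the azure-search-documents package, where CustomAnalyzer lives in azure.search.documents.indexes.models; "lowercase" and "asciifolding" are among the service's known token filter names.

```python
from azure.search.documents.indexes.models import CustomAnalyzer

# Split on whitespace, then lowercase each token and fold accented
# characters to ASCII. Token filters run in the order they are listed.
analyzer = CustomAnalyzer(
    name="my-lowercase-analyzer",
    tokenizer_name="whitespace",
    token_filters=["lowercase", "asciifolding"],
)
```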
Methods
| Name | Description |
|---|---|
| as_dict | Return a dict that can be turned into JSON using json.dump. |
| clear | Remove all items from the model. |
| copy | Return a copy of the model. |
| get | Return the value for key if key is in the model, else default. |
| items | Return a set-like view of the model's items. |
| keys | Return a set-like view of the model's keys. |
| pop | Remove the specified key and return the corresponding value. |
| popitem | Remove and return a (key, value) pair. |
| setdefault | Return the value for key if key is in the model; otherwise set it to default and return it. |
| update | Update the model from a mapping or an iterable of key-value pairs. |
| values | Return a view of the model's values. |
as_dict
Return a dict that can be turned into JSON using json.dump.
as_dict(*, exclude_readonly: bool = False) -> dict[str, Any]
Keyword-Only Parameters
| Name | Description |
|---|---|
| exclude_readonly | Whether to remove the read-only properties. Default value: False |
Returns
| Type | Description |
|---|---|
| dict[str, Any] | A JSON-compatible dict object. |
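A short usage sketch, assuming analyzer is the CustomAnalyzer instance from the constructor example above:

```python
import json

# as_dict() produces a JSON-compatible dict, so the standard json
# module can serialize it directly.
payload = analyzer.as_dict()
print(json.dumps(payload, indent=2))
```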
clear
Remove all items from the model.
clear() -> None
copy
Return a copy of the model.
copy() -> Model
get
Return the value for key if key is in the model, else default.
get(key: str, default: Any = None) -> Any
Parameters
| Name | Description |
|---|---|
| key | The key to look up. Required. |
| default | The value to return if key is not in the model. Default value: None |
Returns
| Type | Description |
|---|---|
| any | The value for key if key is in the model, else default. |
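Since the model behaves like a mapping, get() works as it does on a dict. A sketch using the analyzer instance from the constructor example; "no_such_key" is a deliberately missing key:

```python
# An existing key returns its value; a missing key returns the default.
print(analyzer.get("name"))                # "my-lowercase-analyzer"
print(analyzer.get("no_such_key", "n/a"))  # "n/a" (hypothetical missing key)
```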
items
items() -> ItemsView[str, Any]
Returns
| Type | Description |
|---|---|
| ItemsView[str, Any] | A set-like object providing a view on the model's items. |
keys
keys() -> KeysView[str]
Returns
| Type | Description |
|---|---|
| KeysView[str] | A set-like object providing a view on the model's keys. |
pop
Remove the specified key and return the corresponding value. If the key is not found and no default is given, KeyError is raised.
pop(key: str, default: Any = ...) -> Any
Parameters
| Name | Description |
|---|---|
| key | The key to pop. Required. |
| default | The value to return if key is not in the model. |
Returns
| Type | Description |
|---|---|
| any | The value corresponding to the key. |
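A sketch of the default-value behavior, using the same analyzer instance as the examples above:

```python
# With a default supplied, pop() on a missing key returns the default
# instead of raising KeyError. "no_such_key" is a hypothetical key.
value = analyzer.pop("no_such_key", None)  # None
```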
popitem
Remove and return a (key, value) pair as a tuple. Raises KeyError if the model is empty.
popitem() -> tuple[str, Any]
setdefault
Return the value for key if key is in the model; otherwise set it to default and return default.
setdefault(key: str, default: Any = ...) -> Any
Parameters
| Name | Description |
|---|---|
| key | The key to look up. Required. |
| default | The value to set if key is not in the model. |
Returns
| Type | Description |
|---|---|
| any | The value for key if key is in the model, else default. |
update
Update the model from a mapping or an iterable of key-value pairs.
update(*args: Any, **kwargs: Any) -> None
Parameters
| Name | Description |
|---|---|
| args | Either a mapping object or an iterable of key-value pairs. |
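A sketch of both accepted forms, again on the analyzer instance from the constructor example:

```python
# update() mirrors dict.update: pass a mapping or an iterable of pairs.
analyzer.update({"name": "renamed-analyzer"})
analyzer.update([("name", "renamed-again")])
```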
values
values() -> ValuesView[Any]
Returns
| Type | Description |
|---|---|
| ValuesView[Any] | An object providing a view on the model's values. |
Attributes
char_filters
A list of character filters used to prepare input text before it is processed by the tokenizer. For instance, they can replace certain characters or symbols. The filters are run in the order in which they are listed.
char_filters: list[str | _models.CharFilterName] | None
name
The name of the analyzer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters. Required.
name: str
odata_type
A URI fragment specifying the type of analyzer. Required. Default value is "#Microsoft.Azure.Search.CustomAnalyzer".
odata_type: Literal['#Microsoft.Azure.Search.CustomAnalyzer']
token_filters
A list of token filters used to filter out or modify the tokens generated by a tokenizer. For example, you can specify a lowercase filter that converts all characters to lowercase. The filters are run in the order in which they are listed.
token_filters: list[str | _models.TokenFilterName] | None
tokenizer_name
The name of the tokenizer to use to divide continuous text into a sequence of tokens, such as breaking a sentence into words. Required. Known values are: "classic", "edgeNGram", "keyword_v2", "letter", "lowercase", "microsoft_language_tokenizer", "microsoft_language_stemming_tokenizer", "nGram", "path_hierarchy_v2", "pattern", "standard_v2", "uax_url_email", and "whitespace".
tokenizer_name: str | _models.LexicalTokenizerName
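Putting the attributes together, a short sketch reading them back from the analyzer instance built in the constructor example; odata_type is populated with its default value.

```python
print(analyzer.name)            # "my-lowercase-analyzer"
print(analyzer.tokenizer_name)  # "whitespace"
print(analyzer.token_filters)   # ["lowercase", "asciifolding"]
print(analyzer.odata_type)      # "#Microsoft.Azure.Search.CustomAnalyzer"
```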