Models
PageData
Bases: BaseModel
Representation of the data collected from a single web page
Examples:
>>> from extract_emails.models import PageData
>>> page_data = PageData(website='https://example.com', page_url='https://example.com/page123')
Attributes:
Name | Type | Description |
---|---|---|
website | str | Address of the website the data was collected from |
page_url | str | URL of the page the data was collected from |
data | Optional[Dict[str, List[str]]] | Data from the page in the format { 'label': [data, data] }, default: {} |
Source code in extract_emails/models/page_data.py
append(label, vals)
Append data from a page to the self.data collection
Examples:
>>> from extract_emails.models import PageData
>>> page_data = PageData(website='https://example.com', page_url='https://example.com/page123')
>>> page_data.append('email', ['email1@email.com', 'email2@email.com'])
>>> page_data.data
{'email': ['email1@email.com', 'email2@email.com']}
Parameters:
Name | Type | Description | Default |
---|---|---|---|
label | str | name of the collection, e.g. email, linkedin | required |
vals | list[str] | data from a page, e.g. emails, specific URLs etc. | required |
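The behaviour described above can be illustrated with a small stand-in class. This is a sketch only, not the library's source: `PageDataSketch` is a hypothetical dataclass used so the snippet runs without pydantic, and the choice to extend an existing label's list on a repeated call is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class PageDataSketch:
    """Hypothetical stand-in for PageData (not the library's model)."""

    website: str
    page_url: str
    data: dict[str, list[str]] = field(default_factory=dict)

    def append(self, label: str, vals: list[str]) -> None:
        # Assumption: a repeated label extends the list already stored
        # under that label instead of replacing it.
        self.data.setdefault(label, []).extend(vals)


page = PageDataSketch(
    website="https://example.com",
    page_url="https://example.com/page123",
)
page.append("email", ["email1@email.com"])
page.append("email", ["email2@email.com"])
page.append("linkedin", ["https://www.linkedin.com/in/example"])
```

Each label ends up as one key in `data`, holding every value appended under it in insertion order.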
Source code in extract_emails/models/page_data.py
ato_csv(data, filepath)
async
classmethod
Asynchronously save a list of PageData objects to a CSV file
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | list[PageData] | list of PageData | required |
filepath | Path | path to a CSV file | required |
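How the library performs its asynchronous write is not shown here. As a general illustration, one common way to keep a blocking CSV write off the event loop is `asyncio.to_thread` (Python 3.9+). The helpers `write_rows` and `awrite_rows`, and the fixed `email` column, are hypothetical and chosen only for this sketch.

```python
import asyncio
import csv
import tempfile
from pathlib import Path


def write_rows(rows: list[dict[str, str]], filepath: Path) -> None:
    # Blocking write of rows that are already flattened to one dict per line.
    with open(filepath, "w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["website", "page_url", "email"])
        writer.writeheader()
        writer.writerows(rows)


async def awrite_rows(rows: list[dict[str, str]], filepath: Path) -> None:
    # Run the blocking writer in a worker thread so the event loop stays free.
    await asyncio.to_thread(write_rows, rows, filepath)


rows = [
    {
        "website": "https://example.com",
        "page_url": "https://example.com/page123",
        "email": "email1@email.com",
    }
]
out_path = Path(tempfile.mkdtemp()) / "pages_async.csv"
asyncio.run(awrite_rows(rows, out_path))
written = out_path.read_text(encoding="utf-8").splitlines()
```

Offloading to a thread suits a scraper that is still awaiting network responses while results are being saved.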
Source code in extract_emails/models/page_data.py
to_csv(data, filepath)
classmethod
Save a list of PageData objects to a CSV file
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | list[PageData] | list of PageData | required |
filepath | Path | path to a CSV file | required |
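One way such an export could be laid out is sketched below, assuming one CSV row per extracted value with `website`, `page_url`, and one column per label. The column layout, the `PageRecord` stand-in, and the `write_csv` helper are hypothetical; the library's actual output format may differ.

```python
import csv
import tempfile
from dataclasses import dataclass, field
from itertools import zip_longest
from pathlib import Path


@dataclass
class PageRecord:
    """Hypothetical stand-in for PageData (not the library's model)."""

    website: str
    page_url: str
    data: dict[str, list[str]] = field(default_factory=dict)


def write_csv(records: list[PageRecord], filepath: Path) -> None:
    # Collect every label across all records, keeping first-seen order.
    labels = list(dict.fromkeys(label for r in records for label in r.data))
    with open(filepath, "w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["website", "page_url", *labels])
        writer.writeheader()
        for r in records:
            columns = [r.data.get(label, []) for label in labels]
            # zip_longest pads shorter label lists with empty strings.
            for values in zip_longest(*columns, fillvalue=""):
                row = {"website": r.website, "page_url": r.page_url}
                row.update(zip(labels, values))
                writer.writerow(row)


records = [
    PageRecord(
        "https://example.com",
        "https://example.com/page123",
        {"email": ["email1@email.com", "email2@email.com"]},
    )
]
out_path = Path(tempfile.mkdtemp()) / "pages.csv"
write_csv(records, out_path)
with open(out_path, encoding="utf-8", newline="") as f:
    rows = list(csv.DictReader(f))
```

Opening the file with `newline=""` follows the `csv` module's recommendation and avoids blank lines between rows on Windows.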
Source code in extract_emails/models/page_data.py