Decentralized Friendica Directory
Installing
1. Initialize the database
Create a database with a username and a password.
Then import dfrndir.sql into it.
2. Create an autoloader with composer
Make sure you have Composer installed globally, or adapt the command to use a .phar.
composer dump-autoload
3. Set up the cronjobs
Example cronjob using the www-data user.
*/30 * * * * www-data cd /var/www/friendica-directory; php include/cron_maintain.php
*/5 * * * * www-data cd /var/www/friendica-directory; php include/cron_sync.php
How syncing works
Syncing works in two directions: pushing and pulling.
Pushing
Submissions you receive can be forwarded to other directories using a push target.
You do this by creating an entry in the sync-targets table with the push bit set to 1.
You must also enable pushing in your .htconfig settings.
The next time include/cron_sync.php is run from your cronjob, the queued items will be submitted to your push targets.
Pulling
For pulling to work, the target server must enable pulling.
This makes the /sync/pull/all and /sync/pull/since/[when] methods work on that server.
Next, add an entry in the sync-targets table with the pull bit set to 1.
You must also enable pulling in your .htconfig settings.
The next time include/cron_sync.php is run from your cronjob, the pull sources will be checked.
New items will be queued in your pull queue.
The queue will be gradually cleared based on your syncing.max_pull_items setting.
You can check the backlog of this queue on the /admin page.
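The pull flow described above can be sketched roughly as follows. This is an illustration only, not the directory's actual code: the function names, the example host, the FIFO queue order and the value used for [when] are all assumptions, and only the two endpoint paths and the idea of a per-run item limit come from this README.

```python
from collections import deque


def pull_url(base_url, since=None):
    """Build the pull endpoint URL for a source directory.

    `since` is passed through untouched; the format the server expects
    for [when] is not specified here, so treat it as an opaque value.
    """
    base = base_url.rstrip("/")
    if since is None:
        return base + "/sync/pull/all"
    return "{}/sync/pull/since/{}".format(base, since)


def drain_pull_queue(queue, max_pull_items, process):
    """Process at most `max_pull_items` queued entries in one cron run.

    Returns the remaining backlog size. Oldest-first (FIFO) order is an
    assumption of this sketch.
    """
    for _ in range(min(max_pull_items, len(queue))):
        process(queue.popleft())
    return len(queue)


# Hypothetical usage: five queued profile URLs, limit of two per run.
queue = deque(["p1", "p2", "p3", "p4", "p5"])
handled = []
backlog = drain_pull_queue(queue, 2, handled.append)
```

With a small limit the backlog shrinks a little on every cron run, which is why a large initial pull can take several runs to clear.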
How submissions are processed
- The /submit endpoint takes a ?url= parameter. This parameter is an encoded URL: the original ASCII is treated as binary and base16 encoded. This URL should be a profile location, such as https://fc.oscp.info/profile/admin. This URL will be checked in the database for existing accounts. This check includes a normalization: http vs https is ignored, as are www. prefixes.
- If noscrape is supported by the site, it will be used instead of a scrape request, in this case https://fc.oscp.info/noscrape/admin. If noscrape fails or is not supported, the URL provided (as is) will be scraped for meta information:
  - <meta name="dfrn-global-visibility" content="true" />
  - <meta name="friendica.community" content="true" /> or <meta name="friendika.community" content="true" />
  - <meta name="keywords" content="these,are,your,public,tags" />
  - <link rel="dfrn-*" href="https://fc.oscp.info/*" /> any dfrn-* prefixed link and its href attribute.
  - .vcard .fn as fn
  - .vcard .title as pdesc
  - .vcard .photo as photo
  - .vcard .key as key
  - .vcard .locality as locality
  - .vcard .region as region
  - .vcard .postal-code as postal-code
  - .vcard .country-name as country-name
- If the dfrn-global-visibility value is set to false, any existing records will be deleted and the process exits here.
- A submission is IGNORED when the following minimum data could not be scraped:
  - key: the public key from the hCard.
  - dfrn-request: required for the DFRN protocol.
  - dfrn-confirm: required for the DFRN protocol.
  - dfrn-notify: required for the DFRN protocol.
  - dfrn-poll: required for the DFRN protocol.
- If the profile existed in the database and is not explicitly set to public using the dfrn-global-visibility meta tag, it will be deleted.
- If the profile existed in the database and lacks either an fn or photo attribute, it will be deleted.
- The profile is now inserted/updated based on the found information. Notable database fields are:
  - homepage: the original (decoded) ?url= parameter.
  - nurl: the normalized URL, created to remove http vs https and www vs non-www differences.
  - created: the creation date and time in UTC (now if the entry did not exist yet).
  - updated: the current date and time in UTC.
- If an insert has occurred, the URL will now be used to check for duplicates. The highest insert ID will be kept, anything else deleted.
- If provided, your public tags are now split by
- The photo provided will be downloaded and resized to 80x80, regardless of source size.
- Should an error have occurred by this point, for example because no profile ID is known, everything will be deleted based on the original ?url= parameter.
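To make the first and fourth steps above more concrete, here is a minimal sketch of the ?url= encoding, the http/https and www. normalization, and the required-field check. The helper names are hypothetical, the exact form of the stored normalized URL is an assumption, and the check reads the required list as "every field must be present"; the directory's own code may differ on all three points.

```python
import binascii
from urllib.parse import urlsplit

# Fields listed above as the minimum data for a submission.
REQUIRED_FIELDS = ("key", "dfrn-request", "dfrn-confirm", "dfrn-notify", "dfrn-poll")


def encode_submit_url(profile_url):
    """Base16 (hex) encode a profile URL for the ?url= parameter."""
    return binascii.hexlify(profile_url.encode("ascii")).decode("ascii")


def decode_submit_url(encoded):
    """Reverse of encode_submit_url."""
    return binascii.unhexlify(encoded).decode("ascii")


def normalize_url(url):
    """Collapse http/https and www./non-www variants to one form.

    Assumption: lowercase the host, drop a leading "www.", force the
    scheme to http, and strip a trailing slash.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return "http://" + host + parts.path.rstrip("/")


def missing_required(scraped):
    """Return the required fields that are absent or empty in `scraped`."""
    return [name for name in REQUIRED_FIELDS if not scraped.get(name)]


profile = "https://fc.oscp.info/profile/admin"
encoded = encode_submit_url(profile)
submit_path = "/submit?url=" + encoded
```

Under these assumptions, a submission would be ignored whenever missing_required returns a non-empty list, and two URLs refer to the same profile whenever normalize_url gives the same result for both.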
Note about search
The Directory uses MySQL fulltext capabilities to index profiles and offer a search feature.
However, the default minimum word size MySQL will index is 4, which ignores words like PHP and USA.
To index words smaller than 4 characters, you will have to edit your my.cnf/my.ini file to include this:
[mysqld]
ft_min_word_len = 3
Then restart your MySQL server.
If you already had data in your profile table, you will need to rebuild the index by executing the following query:
REPAIR TABLE `profile` QUICK;
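The effect of the minimum word length can be illustrated with a small sketch. It only mimics the length cut-off; MySQL's real tokenizer and its stopword list are not modeled, and the function name is made up for this example.

```python
import re


def indexed_words(text, min_word_len):
    """Return the words that survive a minimum-length cut-off."""
    return [w for w in re.findall(r"\w+", text) if len(w) >= min_word_len]


sample = "PHP developer in the USA"
with_default = indexed_words(sample, 4)
with_lowered = indexed_words(sample, 3)
```

With the cut-off at 4 only "developer" survives from the sample text; lowering it to 3 lets "PHP" and "USA" through as well.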