Compare commits

...

112 commits

Author SHA1 Message Date
Matthew Exon 942e5135da Mailstream: respect blocked/ignored/collapsed contact settings 2024-08-18 15:52:52 +02:00
Matthew Exon 7f4a2ff5d0 More comprehensible check for root user contact 2024-08-18 15:52:52 +02:00
Matthew Exon bc5604137d Revert "log uid but ignore results"
This reverts commit 0f5ba218f6.
2024-08-18 15:52:51 +02:00
Matthew Exon dec73dca94 Another attempt to resolve local urls 2024-08-18 15:52:51 +02:00
Matthew Exon fafe6a2ea4 a bit more defensiveness about add_retriever_item 2024-08-18 15:52:51 +02:00
Matthew Exon 13f3b44522 globalise urls now handles relative urls 2024-08-18 15:52:51 +02:00
Matthew Exon c7cb79c495 globalise_urls works better when retrospectively applying 2024-08-18 15:52:51 +02:00
Matthew Exon ed49acf61c fix whitespace 2024-08-18 15:52:51 +02:00
Matthew Exon fb944ccc06 Fix broken images that have been broken for ages 2024-08-18 15:52:51 +02:00
Matthew Exon 8e96300d1b adaptation for 2024.03 2024-08-18 15:52:51 +02:00
Matthew Exon 06a1d16f8d adaptation for 2024.03 2024-08-18 15:52:51 +02:00
Matthew Exon bf89256402 trying to get phototrack to work 2024-08-18 15:52:51 +02:00
Matthew Exon 21bf8ee05d some more robust mailstream stuff 2024-08-18 15:52:51 +02:00
Matthew Exon 95f4c67613 debugging some issues 2024-08-18 15:52:51 +02:00
Matthew Exon 0404f2155b more overdue adaptations 2024-08-18 15:52:51 +02:00
Matthew Exon 852b3706be some changes that were long overdue 2024-08-18 15:52:51 +02:00
Matthew Exon 5f4a1264ae more adaption to latest release 2024-08-18 15:52:51 +02:00
Matthew Exon 663cd44b89 adapt to latest release 2024-08-18 15:52:51 +02:00
Matthew Exon 2d02d8817c log uid but ignore results 2024-08-18 15:52:51 +02:00
Matthew Exon ce75886f9d remove duplicate use directive 2024-08-18 15:52:51 +02:00
Matthew Exon d1288d997d fix contact photo menu callback really 2024-08-18 15:52:51 +02:00
Matthew Exon 5a7b47aa18 fix contact photo menu callback 2024-08-18 15:52:51 +02:00
Matthew Exon 7180ae4033 replace local_user 2024-08-18 15:52:51 +02:00
Michael 156654e076 The priority is now a class constant 2024-08-18 15:52:51 +02:00
Matthew Exon 2eb33ad1e1 Add missing use statement 2024-08-18 15:52:51 +02:00
Matthew Exon 3c3b399a3d add types to parameters 2024-08-18 15:52:51 +02:00
Matthew Exon 48838999d9 fix order of upgrade commands 2024-08-18 15:52:51 +02:00
Matthew Exon a8c16bab1c add log lines to install 2024-08-18 15:52:51 +02:00
Matthew Exon bc04aea856 Fix length of keys 2024-08-18 15:52:51 +02:00
Matthew Exon 6d708e5246 Use new hook registration calls 2024-08-18 15:52:51 +02:00
Matthew Exon d5d909d460 Update to correct collation mode 2024-08-18 15:52:51 +02:00
Matthew Exon 3860957144 Use separate album and repair dox for ces 2024-08-18 15:52:51 +02:00
Matthew Exon be31b42ad9 fix comment 2024-08-18 15:52:51 +02:00
Matthew Exon 076e1a0c51 correct use of fetchFull 2024-08-18 15:52:51 +02:00
Matthew Exon a4ff47be01 fix argv stuff 2024-08-18 15:52:51 +02:00
Matthew Exon 19e0e8c2a3 fix argv stuff 2024-08-18 15:52:51 +02:00
Matthew Exon 6babcd4ae1 use new temppath function 2024-08-18 15:52:51 +02:00
Matthew Exon f0217ccc01 fix sql syntax 2024-08-18 15:52:51 +02:00
Matthew Exon 04d5877bc7 improvements 2024-08-18 15:52:51 +02:00
Matthew Exon 6c45bb5423 syntax errors 2024-08-18 15:52:51 +02:00
Matthew Exon 5f05407cab syntax errors 2024-08-18 15:52:51 +02:00
Matthew Exon 9a7088f9ab syntax errors 2024-08-18 15:52:51 +02:00
Matthew Exon ff0d43aaa9 syntax errors 2024-08-18 15:52:51 +02:00
Matthew Exon 4966d55ca1 this is more correcter 2024-08-18 15:52:51 +02:00
Matthew Exon b9e89e0445 this is more correct 2024-08-18 15:52:51 +02:00
Matthew Exon 5c0cb50313 another migrated function 2024-08-18 15:52:51 +02:00
Matthew Exon 30fb631982 add anotehr check 2024-08-18 15:52:51 +02:00
Matthew Exon 27ea25dafc also update these queries 2024-08-18 15:52:51 +02:00
Matthew Exon 42ada8257d stray line 2024-08-18 15:52:51 +02:00
Matthew Exon ef2ddfca16 perhaps it should be this style 2024-08-18 15:52:51 +02:00
Matthew Exon 9aa1ac4a1e attempt to handle one error 2024-08-18 15:52:51 +02:00
Matthew Exon d5414aeb52 new style of http request 2024-08-18 15:52:51 +02:00
Matthew Exon 7a77543b53 switch to new way of executing SQL 2024-08-18 15:52:51 +02:00
Matthew Exon 4fdef2b5aa switch to new way of executing SQL 2024-08-18 15:52:51 +02:00
Matthew Exon 7df1027c90 switch to new way of executing SQL 2024-08-18 15:52:51 +02:00
Matthew Exon 1a2752c620 sync with submitted 2024-08-18 15:52:51 +02:00
Matthew Exon 7d35d9a498 error checking in retriever 2024-08-18 15:52:51 +02:00
Matthew Exon b109a38f97 fix another stupid mistake 2024-08-18 15:52:51 +02:00
Matthew Exon 147825d666 fix another stupid mistake 2024-08-18 15:52:51 +02:00
Matthew Exon c4e833581e Detect an error in mailstream 2024-08-18 15:52:51 +02:00
Matthew Exon ef8d7703e5 fixed another obvious mistake 2024-08-18 15:52:51 +02:00
Matthew Exon 708dc4264d Fix a typo 2024-08-18 15:52:51 +02:00
Matthew Exon a0b2ef3d04 another check for empty results 2024-08-18 15:52:51 +02:00
Matthew Exon 847e3598a5 Adapt Item methods to Post methods 2024-08-18 15:52:51 +02:00
Matthew Exon dfe225b856 Remove binary field from httpRequest 2024-08-18 15:52:51 +02:00
Matthew Exon c48be6270a Replace fetchUrlFull with HTTPRequest version 2024-08-18 15:52:51 +02:00
Matthew Exon aa15745825 Remove unneeded get_app 2024-08-18 15:52:51 +02:00
Matthew Exon 056d455af8 Fix page assembly 2024-08-18 15:52:51 +02:00
Matthew Exon b659f65c2b Update with base url changes and strict key requirements 2024-08-18 15:52:51 +02:00
Matthew Exon c4be01c084 Further updates to 2020.03 2024-08-18 15:52:51 +02:00
Matthew Exon 3f9e848f54 Use new L10n thing 2024-08-18 15:52:51 +02:00
Matthew Exon 14281e6e8f Update to new module structure 2024-08-18 15:52:51 +02:00
Matthew Exon 1f363a9c6f maybe this way works better 2024-08-18 15:52:51 +02:00
Matthew Exon 0eb6222456 New way of doing baseurl 2024-08-18 15:52:51 +02:00
Matthew Exon 9d89f56011 Missing class 2024-08-18 15:52:51 +02:00
Matthew Exon 76c9cbb500 Update for new version 2024-08-18 15:52:51 +02:00
Matthew Exon 846081d362 Fix bug in phototrack 2024-08-18 15:52:51 +02:00
Matthew Exon 0eba27bc30 remove help section if images not allowed 2024-08-18 15:52:51 +02:00
Matthew Exon f7aa935071 Almost finished, maybe not working 2024-08-18 15:52:51 +02:00
Matthew Exon d068bcb010 working much better 2024-08-18 15:52:51 +02:00
Matthew Exon dffc20b969 I think this works 2024-08-18 15:52:51 +02:00
Matthew Exon 2a40b9b1d6 small addition 2024-08-18 15:52:51 +02:00
Matthew Exon 084303eae1 small cleanup 2024-08-18 15:52:51 +02:00
Matthew Exon 90c4f7efe9 working much better 2024-08-18 15:52:51 +02:00
Matthew Exon 090cc026ef maybe broken again 2024-08-18 15:52:51 +02:00
Matthew Exon d834b4b2e2 Now retriever works again 2024-08-18 15:52:51 +02:00
Matthew Exon 045efb7421 extensive refactoring 2024-08-18 15:52:51 +02:00
Matthew Exon f0b55c3331 retriever tweaks 2024-08-18 15:52:51 +02:00
Matthew Exon e563cdd3cd Add phototrack and publicise 2024-08-18 15:52:51 +02:00
Matthew Exon 7d31da3893 configurable number of requests 2024-08-18 15:52:51 +02:00
Matthew Exon 3632c8a789 update version number 2024-08-18 15:52:51 +02:00
Matthew Exon eae745d348 Stuff in retriever 2024-08-18 15:52:51 +02:00
Matthew Exon 95539914e9 fixed image regex 2024-08-18 15:52:51 +02:00
Matthew Exon ef1aafef2d more dba stuff 2024-08-18 15:52:51 +02:00
Matthew Exon 621cb0b4d3 fakerei2 2024-08-18 15:52:51 +02:00
Matthew Exon 28539b8516 Fix bugs in retriever retrospective stuff 2024-08-18 15:52:51 +02:00
Matthew Exon 640c0f0da9 more retriever stuff 2024-08-18 15:52:51 +02:00
Administrator 9c8a531b78 Fix retriever database problems 2024-08-18 15:52:51 +02:00
Matthew Exon 09fe9840bb retriever stuff 2024-08-18 15:52:51 +02:00
Matthew Exon eb3500ff7c Change logging functions 2024-08-18 15:52:51 +02:00
Matthew Exon fdb6afd3a5 Improvement 2024-08-18 15:52:51 +02:00
Administrator 9b156fde2c this is working OK 2024-08-18 15:52:51 +02:00
Matthew Exon 6146ecbab4 fixed a bug and commented on another 2024-08-18 15:52:51 +02:00
Matthew Exon 3bbe03a21a fix 2024-08-18 15:52:51 +02:00
Matthew Exon 9719138a46 tentative database work 2024-08-18 15:52:51 +02:00
Matthew Exon f7941d32c6 More preparation for persistent cookies 2024-08-18 15:52:51 +02:00
Matthew Exon b15bc849fa beginnings of persistent cookiejar support 2024-08-18 15:52:51 +02:00
Matthew Exon 7769022613 now working retriever 2024-08-18 15:52:51 +02:00
Matthew Exon 7ac2b7a641 more fixes 2024-08-18 15:52:51 +02:00
Matthew Exon f60df15a4f more fixes 2024-08-18 15:52:51 +02:00
Matthew Exon b71968026c Fixes for retriever 2024-08-18 15:52:51 +02:00
Matthew Exon 9cc85eaa5b Latest version of retriever 2024-08-18 15:52:51 +02:00
15 changed files with 4199 additions and 230 deletions

@ -180,5 +180,5 @@ function ifttt_message($uid, $item)
$link = hash('ripemd128', $item['msg']); $link = hash('ripemd128', $item['msg']);
} }
Post\Delayed::add($link, $post, Worker::PRIORITY_MEDIUM, Post\Delayed::PREPARED); Post\Delayed::add($link, $post, Worker::PRIORITY_MEDIUM, Post\Delayed::UNPREPARED);
} }

@ -118,13 +118,46 @@ function mailstream_send_hook(array $data)
return; return;
} }
$user = User::getById($item['uid']); if ($item['deleted']) {
if (empty($user)) { Logger::debug('mailstream_send_hook skipping deleted item', ['guid' => $item['guid']]);
Logger::error('mailstream_send_hook could not fund user', ['uid' => $item['uid']]);
return; return;
} }
if (!mailstream_send($data['message_id'], $item, $user)) { $user = User::getById($item['uid']);
if (empty($user)) {
Logger::error('mailstream_send_hook could not find user', ['uid' => $item['uid']]);
return;
}
$author = DBA::selectFirst('contact', ['nick', 'blocked', 'uri-id'], ['id' => $data['author-id'], 'self' => false]);
if (!DBA::isResult($author)) {
Logger::error('mailstream_send_hook could not find author', ['guid' => $item['guid'], 'author-id' => $data['author-id']]);
return;
}
if ($author['blocked']) {
Logger::info('mailstream_send_hook author is blocked', ['guid' => $item['guid'], 'author-id' => $data['author-id']]);
return;
}
$collapsed = false;
$user_contact = DBA::selectFirst('user-contact', ['cid', 'blocked', 'ignored', 'collapsed'], ['uid' => $item['uid'], 'uri-id' => $item['author-uri-id']]);
if (!DBA::isResult($user_contact)) {
$user_contact = DBA::selectFirst('user-contact', ['cid', 'blocked', 'ignored', 'collapsed'], ['uid' => $item['uid'], 'cid' => $item['author-id']]);
}
if (DBA::isResult($user_contact)) {
if ($user_contact['blocked']) {
Logger::info('mailstream_send_hook author is blocked', ['guid' => $item['guid'], 'cid' => $user_contact['cid']]);
return;
}
if ($user_contact['ignored']) {
Logger::info('mailstream_send_hook author is ignored', ['guid' => $item['guid'], 'cid' => $user_contact['cid']]);
return;
}
if ($user_contact['collapsed']) {
$collapsed = true;
}
}
if (!mailstream_send($data['message_id'], $item, $user, $collapsed)) {
Logger::debug('mailstream_send_hook send failed, will retry', $data); Logger::debug('mailstream_send_hook send failed, will retry', $data);
if (!Worker::defer()) { if (!Worker::defer()) {
Logger::error('mailstream_send_hook failed and could not defer', $data); Logger::error('mailstream_send_hook failed and could not defer', $data);
@ -144,12 +177,12 @@ function mailstream_post_hook(array &$item)
{ {
mailstream_check_version(); mailstream_check_version();
if (!DI::pConfig()->get($item['uid'], 'mailstream', 'enabled')) { if ($item['uid'] === 0) {
Logger::debug('mailstream: not enabled.', ['item' => $item['id'], ' uid ' => $item['uid']]); Logger::debug('mailstream: root user, skipping item ' . $item['id']);
return; return;
} }
if (!$item['uid']) { if (!DI::pConfig()->get($item['uid'], 'mailstream', 'enabled')) {
Logger::debug('mailstream: no uid for item ' . $item['id']); Logger::debug('mailstream: not enabled.', ['item' => $item['id'], ' uid ' => $item['uid']]);
return; return;
} }
if (!$item['contact-id']) { if (!$item['contact-id']) {
@ -180,6 +213,7 @@ function mailstream_post_hook(array &$item)
$send_hook_data = [ $send_hook_data = [
'uid' => $item['uid'], 'uid' => $item['uid'],
'contact-id' => $item['contact-id'], 'contact-id' => $item['contact-id'],
'author-id' => $item['author-id'],
'uri' => $item['uri'], 'uri' => $item['uri'],
'message_id' => $message_id, 'message_id' => $message_id,
'tries' => 0, 'tries' => 0,
@ -364,10 +398,11 @@ function mailstream_subject(array $item): string
* @param string $message_id ID of the message (RFC 1036) * @param string $message_id ID of the message (RFC 1036)
* @param array $item content of the item * @param array $item content of the item
* @param array $user results from the user table * @param array $user results from the user table
* @param bool $collapsed true if the content should be hidden
* *
* @return bool True if this message has been completed. False if it should be retried. * @return bool True if this message has been completed. False if it should be retried.
*/ */
function mailstream_send(string $message_id, array $item, array $user): bool function mailstream_send(string $message_id, array $item, array $user, bool $collapsed): bool
{ {
if (!is_array($item)) { if (!is_array($item)) {
Logger::error('mailstream_send item is empty', ['message_id' => $message_id]); Logger::error('mailstream_send item is empty', ['message_id' => $message_id]);
@ -386,10 +421,16 @@ function mailstream_send(string $message_id, array $item, array $user): bool
require_once (dirname(__file__) . '/phpmailer/class.phpmailer.php'); require_once (dirname(__file__) . '/phpmailer/class.phpmailer.php');
$item['body'] = Post\Media::addAttachmentsToBody($item['uri-id'], $item['body']); if ($collapsed) {
$item['body'] = DI::l10n()->t('Content from %s is collapsed', $item['author-name']);
} else {
$item['body'] = Post\Media::addAttachmentsToBody($item['uri-id'], $item['body']);
}
$attachments = []; $attachments = [];
mailstream_do_images($item, $attachments); if (!$collapsed) {
mailstream_do_images($item, $attachments);
}
$frommail = DI::config()->get('mailstream', 'frommail'); $frommail = DI::config()->get('mailstream', 'frommail');
if ($frommail == '') { if ($frommail == '') {
$frommail = 'friendica@localhost.local'; $frommail = 'friendica@localhost.local';

23
phototrack/database.sql Normal file
@ -0,0 +1,23 @@
CREATE TABLE IF NOT EXISTS `phototrack_photo_use` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`resource-id` char(64) NOT NULL,
`table` char(64) NOT NULL,
`field` char(64) NOT NULL,
`row-id` int(11) NOT NULL,
`checked` timestamp NOT NULL DEFAULT now(),
PRIMARY KEY (`id`),
INDEX `resource-id` (`resource-id`),
INDEX `row` (`table`,`field`,`row-id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
CREATE TABLE IF NOT EXISTS `phototrack_row_check` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`table` char(64) NOT NULL,
`row-id` int(11) NOT NULL,
`checked` timestamp NOT NULL DEFAULT now(),
PRIMARY KEY (`id`),
INDEX `row` (`table`,`row-id`),
INDEX `checked` (`checked`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
SELECT TRUE

272
phototrack/phototrack.php Normal file
@ -0,0 +1,272 @@
<?php
/**
* Name: Photo Track
* Description: Track which photos are actually being used and delete any others
* Version: 1.0
* Author: Matthew Exon <http://mat.exon.name>
*/
/*
* List of tables and the fields that are checked:
*
* contact: photo thumb micro about
* fcontact: photo
* fsuggest: photo
* gcontact: photo about
* item: body
* item-content: body
* mail: from-photo
* notify: photo
* profile: photo thumb about
*/
use Friendica\Core\Addon;
use Friendica\Core\Logger;
use Friendica\Object\Image;
use Friendica\Database\DBA;
use Friendica\Util\Images;
use Friendica\Util\DateTimeFormat;
use Friendica\DI;
if (!defined('PHOTOTRACK_DEFAULT_BATCH_SIZE')) {
define('PHOTOTRACK_DEFAULT_BATCH_SIZE', 1000);
}
// Time in *minutes* between searching for photo uses
if (!defined('PHOTOTRACK_DEFAULT_SEARCH_INTERVAL')) {
define('PHOTOTRACK_DEFAULT_SEARCH_INTERVAL', 10);
}
function phototrack_install() {
global $db;
Addon::registerHook('post_local_end', 'addon/phototrack/phototrack.php', 'phototrack_post_local_end');
Addon::registerHook('post_remote_end', 'addon/phototrack/phototrack.php', 'phototrack_post_remote_end');
Addon::registerHook('notifier_end', 'addon/phototrack/phototrack.php', 'phototrack_notifier_end');
Addon::registerHook('cron', 'addon/phototrack/phototrack.php', 'phototrack_cron');
if (DI::config()->get('phototrack', 'dbversion') != '0.1') {
$schema = file_get_contents(dirname(__file__).'/database.sql');
$arr = explode(';', $schema);
foreach ($arr as $a) {
if (!DBA::e($a)) {
Logger::warning('Unable to create database table: ' . DBA::errorMessage());
return;
}
}
DI::config()->set('phototrack', 'dbversion', '0.1');
}
}
function phototrack_uninstall() {
Addon::unregisterHook('post_local_end', 'addon/phototrack/phototrack.php', 'phototrack_post_local_end');
Addon::unregisterHook('post_remote_end', 'addon/phototrack/phototrack.php', 'phototrack_post_remote_end');
Addon::unregisterHook('notifier_end', 'addon/phototrack/phototrack.php', 'phototrack_notifier_end');
Addon::unregisterHook('cron', 'addon/phototrack/phototrack.php', 'phototrack_cron');
}
function phototrack_module() {}
function phototrack_finished_row($table, $id) {
$existing = DBA::selectFirst('phototrack_row_check', ['id'], ['table' => $table, 'row-id' => $id]);
if (!is_bool($existing)) {
DBA::update('phototrack_row_check', ['checked' => DateTimeFormat::utcNow()], ['table' => $table, 'row-id' => $id]);
}
else {
DBA::insert('phototrack_row_check', ['table' => $table, 'row-id' => $id, 'checked' => DateTimeFormat::utcNow()]);
}
}
function phototrack_photo_use($photo, $table, $field, $id) {
Logger::debug('@@@ phototrack_photo_use ' . $photo);
foreach (Images::supportedTypes() as $m => $e) {
$photo = str_replace(".$e", '', $photo);
}
if (substr($photo, -2, 1) == '-') {
$resolution = intval(substr($photo,-1,1));
$photo = substr($photo,0,-2);
}
if (strlen($photo) != 32) {
return;
}
$r = DBA::selectFirst('photo', ['resource-id'], ['resource-id' => $photo]);
if (!DBA::isResult($r)) {
return;
}
$rid = $r['resource-id'];
$existing = DBA::selectFirst('phototrack_photo_use', ['id'], ['resource-id' => $rid, 'table' => $table, 'field' => $field, 'row-id' => $id]);
if (DBA::isResult($existing)) {
DBA::update('phototrack_photo_use', ['checked' => DateTimeFormat::utcNow()], ['resource-id' => $rid, 'table' => $table, 'field' => $field, 'row-id' => $id]);
}
else {
DBA::insert('phototrack_photo_use', ['resource-id' => $rid, 'table' => $table, 'field' => $field, 'row-id' => $id, 'checked' => DateTimeFormat::utcNow()]);
}
}
function phototrack_check_field_url($a, $table, $id_field, $field, $id, $url) {
Logger::info('@@@ phototrack_check_field_url table ' . $table . ' id_field ' . $id_field . ' field ' . $field . ' id ' . $id . ' url ' . $url);
$baseurl = DI::baseUrl()->get(true);
if (strpos($url, $baseurl) === FALSE) {
return;
}
else {
$url = substr($url, strlen($baseurl));
Logger::info('@@@ phototrack_check_field_url funny url stuff ' . $url . ' base ' . $baseurl);
}
if (strpos($url, '/photo/') === FALSE) {
return;
}
else {
$url = substr($url, strlen('/photo/'));
Logger::info('@@@ phototrack_check_field_url more url stuff ' . $url);
}
if (preg_match('/([0-9a-z]{32})/', $url, $matches)) {
$rid = $matches[0];
Logger::info('@@@ phototrack_check_field_url rid ' . $rid);
phototrack_photo_use($rid, $table, $field, $id);
}
}
function phototrack_check_field_bbcode($a, $table, $id_field, $field, $id, $value) {
Logger::info('@@@ phototrack_check_field_bbcode table ' . $table . ' id_field ' . $id_field . ' field ' . $field . ' id ' . $id . ' value ' . $value);
$baseurl = DI::baseUrl()->get(true);
$matches = array();
preg_match_all("/\[img(\=([0-9]*)x([0-9]*))?\](.*?)\[\/img\]/ism", $value, $matches);
foreach ($matches[4] as $url) {
phototrack_check_field_url($a, $table, $id_field, $field, $id, $url);
}
}
function phototrack_post_local_end(&$a, &$item) {
phototrack_check_row($a, 'item', 'id', $item);
phototrack_check_row($a, 'item-content', 'id', $item);
}
function phototrack_post_remote_end(&$a, &$item) {
phototrack_check_row($a, 'item', 'id', $item);
phototrack_check_row($a, 'item-content', 'id', $item);
}
function phototrack_notifier_end($item) {
}
function phototrack_check_row($a, $table, $id_field, $row) {
switch ($table) {
case 'post-content':
$fields = array(
'body' => 'bbcode');
break;
case 'contact':
$fields = array(
'photo' => 'url',
'thumb' => 'url',
'micro' => 'url',
'about' => 'bbcode');
break;
case 'fcontact':
$fields = array(
'photo' => 'url');
break;
case 'fsuggest':
$fields = array(
'photo' => 'url');
break;
case 'gcontact':
$fields = array(
'photo' => 'url',
'about' => 'bbcode');
break;
default: $fields = array(); break;
}
foreach ($fields as $field => $type) {
switch ($type) {
case 'bbcode': phototrack_check_field_bbcode($a, $table, $id_field, $field, $row['id'], $row[$field]); break;
case 'url': phototrack_check_field_url($a, $table, $id_field, $field, $row['id'], $row[$field]); break;
}
}
phototrack_finished_row($table, $row['id']);
}
function phototrack_batch_size() {
$batch_size = DI::config()->get('phototrack', 'batch_size');
if ($batch_size > 0) {
return $batch_size;
}
return PHOTOTRACK_DEFAULT_BATCH_SIZE;
}
function phototrack_search_table($a, $table, $id_field) {
$batch_size = phototrack_batch_size();
$rows = DBA::p("SELECT `$table`.* FROM `$table` LEFT OUTER JOIN phototrack_row_check ON ( phototrack_row_check.`table` = '$table' AND phototrack_row_check.`row-id` = `$table`.$id_field ) WHERE ( ( phototrack_row_check.checked IS NULL ) OR ( phototrack_row_check.checked < DATE_SUB(NOW(), INTERVAL 1 MONTH) ) ) ORDER BY phototrack_row_check.checked LIMIT $batch_size");
if (DBA::isResult($rows)) {
while ($row = DBA::fetch($rows)) {
phototrack_check_row($a, $table, $id_field, $row);
}
}
$r = DBA::p("SELECT COUNT(*) AS `count` FROM `$table` LEFT OUTER JOIN phototrack_row_check ON ( phototrack_row_check.`table` = '$table' AND phototrack_row_check.`row-id` = `$table`.$id_field ) WHERE ( ( phototrack_row_check.checked IS NULL ) OR ( phototrack_row_check.checked < DATE_SUB(NOW(), INTERVAL 1 MONTH) ) )");
// Fetch the count row once; a second fetch on the same statement would not return it again
$count_row = DBA::fetch($r);
Logger::info('@@@ phototrack_search_table ' . print_r($count_row, true));
$remaining = intval($count_row['count'] ?? 0);
Logger::info('phototrack: searched ' . DBA::numRows($rows) . ' rows in table ' . $table . ', ' . $remaining . ' still remaining to search');
return $remaining;
}
function phototrack_cron_time() {
$prev_remaining = DI::config()->get('phototrack', 'remaining_items');
if ($prev_remaining > 10 * phototrack_batch_size()) {
Logger::debug('phototrack: more than ' . (10 * phototrack_batch_size()) . ' items remaining');
return true;
}
$last = DI::config()->get('phototrack', 'last_search');
$search_interval = intval(DI::config()->get('phototrack', 'search_interval'));
if (!$search_interval) {
$search_interval = PHOTOTRACK_DEFAULT_SEARCH_INTERVAL;
}
if ($last) {
$next = $last + ($search_interval * 60);
if ($next > time()) {
Logger::debug('phototrack: search interval not reached');
return false;
}
}
Logger::debug('@@@ phototrack: search interval reached last ' . $last . ' search interval ' . $search_interval);
return true;
}
function phototrack_cron($a, $b) {
return; // @@@ something is broken
if (!phototrack_cron_time()) {
return;
}
DI::config()->set('phototrack', 'last_search', time());
$remaining = 0;
$remaining += phototrack_search_table($a, 'post-content', 'uri-id');
$remaining += phototrack_search_table($a, 'contact', 'id');
$remaining += phototrack_search_table($a, 'fcontact', 'id');
$remaining += phototrack_search_table($a, 'fsuggest', 'id');
$remaining += phototrack_search_table($a, 'gcontact', 'id');
DI::config()->set('phototrack', 'remaining_items', $remaining);
if ($remaining === 0) {
phototrack_tidy();
}
}
function phototrack_tidy() {
$batch_size = phototrack_batch_size();
DBA::e('CREATE TABLE IF NOT EXISTS `phototrack-temp` (`resource-id` char(255) not null)');
DBA::e('INSERT INTO `phototrack-temp` SELECT DISTINCT(`resource-id`) FROM photo WHERE photo.`created` < DATE_SUB(NOW(), INTERVAL 2 MONTH)');
$rows = DBA::p('SELECT `phototrack-temp`.`resource-id` FROM `phototrack-temp` LEFT OUTER JOIN phototrack_photo_use ON (`phototrack-temp`.`resource-id` = phototrack_photo_use.`resource-id`) WHERE phototrack_photo_use.id IS NULL limit ' . /*$batch_size*/1000);
if (DBA::isResult($rows)) {
foreach ($rows as $row) {
Logger::debug('phototrack: remove photo ' . $row['resource-id']);
DBA::e('DELETE FROM photo WHERE `resource-id` = "' . $row['resource-id'] . '"');
}
Logger::info('phototrack_tidy: deleted ' . DBA::numRows($rows) . ' photos');
}
DBA::e('DROP TABLE `phototrack-temp`');
$rows = DBA::p('SELECT id FROM phototrack_photo_use WHERE checked < DATE_SUB(NOW(), INTERVAL 2 MONTH)');
foreach ($rows as $row) {
DBA::e( 'DELETE FROM phototrack_photo_use WHERE id = ' . $row['id']);
}
Logger::info('phototrack_tidy: deleted ' . DBA::numRows($rows) . ' phototrack_photo_use rows');
}

11
publicise/publicise.php Normal file
@ -0,0 +1,11 @@
"SELECT `uid` FROM `contact` WHERE `id` = %d AND `reason` = 'publicise'", intval($item['contact-id']));
if (!$r1) {
return;
}
Logger::debug('Publicise: moving to wall: ' . $item['uid'] . ' ' . $item['contact-id'] . ' ' . $item['uri']);
$item['type'] = 'wall';
$item['wall'] = 1;
$item['private'] = 0;
}

@ -0,0 +1,39 @@
{{*
* AUTOMATICALLY GENERATED TEMPLATE
* DO NOT EDIT THIS FILE, CHANGES WILL BE OVERWRITTEN
*
*}}
<form method="post">
<table>
<thead>
<tr>
<th>{{$feed_t}}</th>
<th>{{$publicised_t}}</th>
<th>{{$comments_t}}</th>
<th>{{$expire_t}}</th>
</tr>
</thead>
<tbody>
{{foreach $feeds as $f}}
<tr>
<td>
<a href="{{$f.url}}">
<img style="vertical-align:middle" src='{{$f.micro}}'>
<span style="margin-left:1em">{{$f.name}}</span>
</a>
</td>
<td>
{{include file="field_yesno.tpl" field=$f.enabled}}
</td>
<td>
{{include file="field_yesno.tpl" field=$f.comments}}
</td>
<td>
<input name="publicise-expire-{{$f.id}}" value="{{$f.expire}}">
</td>
</tr>
{{/foreach}}
</tbody>
</table>
<input type="submit" size="70" value="{{$submit_t}}">
</form>

42
retriever/database.sql Normal file
@ -0,0 +1,42 @@
CREATE TABLE IF NOT EXISTS `retriever_rule` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`uid` int(11) NOT NULL,
`contact-id` int(11) NOT NULL,
`data` mediumtext NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `uid` (`uid`),
KEY `contact-id` (`contact-id`)
) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
CREATE TABLE IF NOT EXISTS `retriever_item` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`item-uri` varbinary(255) NOT NULL,
`item-uid` int(10) unsigned NOT NULL DEFAULT '0',
`contact-id` int(10) unsigned NOT NULL DEFAULT '0',
`resource` int(11) NOT NULL,
`finished` tinyint(1) unsigned NOT NULL DEFAULT '0',
KEY `resource` (`resource`),
KEY `finished` (`finished`),
KEY `item-uid` (`item-uid`),
KEY `all` (`item-uri`, `item-uid`, `contact-id`),
PRIMARY KEY (`id`)
) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
CREATE TABLE IF NOT EXISTS `retriever_resource` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`item-uid` int(10) unsigned NOT NULL DEFAULT '0',
`contact-id` int(10) unsigned NOT NULL DEFAULT '0',
`type` char(255) NULL DEFAULT NULL,
`binary` int(1) NOT NULL DEFAULT 0,
`url` varbinary(700) NOT NULL,
`created` timestamp NOT NULL DEFAULT now(),
`completed` timestamp NULL DEFAULT NULL,
`last-try` timestamp NULL DEFAULT NULL,
`num-tries` int(11) NOT NULL DEFAULT 0,
`data` mediumblob NULL DEFAULT NULL,
`http-code` smallint(1) unsigned NULL DEFAULT NULL,
`redirect-url` varbinary(700) NOT NULL,
KEY `url` (`url`),
KEY `completed` (`completed`),
PRIMARY KEY (`id`)
) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;

1070
retriever/retriever.php Normal file

File diff suppressed because it is too large

@ -0,0 +1,9 @@
{{*
* AUTOMATICALLY GENERATED TEMPLATE
* DO NOT EDIT THIS FILE, CHANGES WILL BE OVERWRITTEN
*
*}}
{{include file="field_input.tpl" field=$downloads_per_cron}}
{{include file="field_checkbox.tpl" field=$allow_images}}
<div class="submit"><input type="submit" name="page_site" value="{{$submit}}"></div>

@ -0,0 +1,24 @@
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="html" indent="yes" version="4.0"/>
<xsl:template match="text()"/>
{{function clause_xpath}}{{if !$clause.attribute}}{{$clause.element}}{{elseif $clause.attribute == 'class'}}{{$clause.element}}[contains(concat(' ', normalize-space(@class), ' '), '{{$clause.value}}')]{{else}}{{$clause.element}}[@{{$clause.attribute}}='{{$clause.value}}']{{/if}}{{/function}}
{{foreach $spec.include as $clause}}
<xsl:template match="{{clause_xpath clause=$clause}}">
<xsl:copy>
<xsl:apply-templates select="node()|@*" mode="remove"/>
</xsl:copy>
</xsl:template>{{/foreach}}
{{foreach $spec.exclude as $clause}}
<xsl:template match="{{clause_xpath clause=$clause}}" mode="remove"/>{{/foreach}}
<xsl:template match="node()|@*" mode="remove">
<xsl:copy>
<xsl:apply-templates select="node()|@*" mode="remove"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

@ -0,0 +1,31 @@
<?xml version="1.0" encoding="utf-8"?>
<!-- attempt to replace relative URLs with absolute URLs -->
<!-- http://stackoverflow.com/questions/3824631/replace-href-value-in-anchor-tags-of-html-using-xslt -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="html" indent="yes" version="4.0"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="*/@src[starts-with(.,'.')]">
<xsl:attribute name="src">
<xsl:value-of select="concat('{{$dirurl}}/',.)"/>
</xsl:attribute>
</xsl:template>
<xsl:template match="*/@src[starts-with(.,'/')]">
<xsl:attribute name="src">
<xsl:value-of select="concat('{{$rooturl}}',.)"/>
</xsl:attribute>
</xsl:template>
<xsl:template match="*/@src[not(starts-with(.,'/')) and not(contains(.,':'))]">
<xsl:attribute name="src">
<xsl:value-of select="concat('{{$dirurl}}',.)"/>
</xsl:attribute>
</xsl:template>
</xsl:stylesheet>

@ -0,0 +1,163 @@
<h2>Retriever Plugin Help</h2>
<p>
This plugin replaces the short excerpts you normally get in RSS feeds
with the full content of the article from the source website. You
specify which part of the page you're interested in with a set of
rules. When each item arrives, the plugin downloads the full page
from the website, extracts content using the rules, and replaces the
original article.
</p>
<p>
There are a few reasons you may want to do this. The source website
might be slow or overloaded. The source website might be
untrustworthy, in which case using Friendica to scrub the HTML is a
good idea. You might be on a LAN that blacklists certain websites.
It also works neatly with the mailstream plugin, allowing you to read
a news stream comfortably without needing continuous Internet
connectivity.
</p>
<p>
However, setting up retriever can be quite tricky since it depends on
the internal design of the website, which was chosen to make life
easy for the website's developers, not for you. You'll need to have
some familiarity with HTML, and be willing to adapt when the website
suddenly changes everything without notice.
</p>
<h3>Configuring Retriever for a feed</h3>
<p>
To set up retriever for an RSS feed, go to the "Contacts" page and
find your feed. Then click on the drop-down menu on the contact.
Select "Retriever" to get to the retriever configuration.
</p>
<p>
The "Include" configuration section specifies parts of the page to
include in the article. Each row has three components:
</p>
<ul>
<li>An HTML tag (e.g. "div", "span", "p")</li>
<li>An attribute (usually "class" or "id")</li>
<li>A value for the attribute</li>
</ul>
<p>
A simple case is when the article is wrapped in a "div" element:
</p>
<pre>
...
&lt;div class="ArticleWrapper"&gt;
&lt;h2&gt;Man Bites Dog&lt;/h2&gt;
&lt;img src="mbd.jpg"&gt;
&lt;p&gt;
Residents of the sleepy community of Nowheresville were
shocked yesterday by the sight of creepy local weirdo Jim
McOddman assaulting innocent local dog Snufflekins with his
false teeth.
&lt;/p&gt;
...
&lt;/div&gt;
...
</pre>
<p>
You then specify the tag "div", attribute "class", and value
"ArticleWrapper". Everything else in the page, such as navigation
panels and menus and footers and so on, will be discarded. If there
is more than one section of the page you want to include, specify each
one on a separate row. If the matching section contains some sections
you want to remove, specify those in the "Exclude" section in the same
way.
</p>
<p>
Once you've got a configuration that you think will work, you can try
it out on some existing articles. Type a number into the
"Retrospectively Apply" box and click "Submit". After a while
(exactly how long depends on your system's cron configuration) the new
articles should be available.
</p>
<h3>Techniques</h3>
<p>
You can leave the attribute and value blank to include all the
corresponding elements with the specified tag name. You can also use
a single asterisk ("*") as the tag name, which matches any element
with the specified attribute.
</p>
<p>
Note that the "class" attribute is a special case. Many web page
templates will put multiple different classes in the same element,
separated by spaces. If you specify an attribute of "class" it will
match an element if any of its classes matches the specified value.
For example:
</p>
<pre>
&lt;div class="article breaking-news"&gt;
</pre>
<p>
In this case you can specify a value of "article", or "breaking-news".
You can also specify "article breaking-news", but that won't match if
the website suddenly changes to "breaking-news article", so that's not
recommended.
</p>
<p>
One useful trick you can try is using the website's "print" pages.
Many news sites have print versions of all their articles. These are
usually drastically simplified compared to the live website page.
Sometimes this is a good way to get the whole article when it's
normally split across multiple pages.
</p>
<p>
Hopefully the URL for the print page is a predictable variant of the
normal article URL. For example, an article URL like:
</p>
<pre>
http://www.newssite.com/article-8636.html
</pre>
<p>
...might have a print version at:
</p>
<pre>
http://www.newssite.com/print/article-8636.html
</pre>
<p>
To change the URL used to retrieve the page, use the "URL Pattern" and
"URL Replace" fields. The pattern is a regular expression matching
part of the URL to replace. In this case, you might use a pattern of
"/article" and a replace string of "/print/article". Another common
pattern is a lone dollar sign ("$"), which matches the end of the URL,
so the replace string is appended to it.
</p>
<h3>Background Processing</h3>
<p>
Note that retrieving and processing the articles can take some time,
so it's done in the background. Incoming articles will be marked as
invisible while they're in the process of being downloaded. If a URL
fails, the plugin will keep trying at progressively longer intervals
for up to a month, in case the website is temporarily overloaded or
the network is down.
</p>
{{if $allow_images}}
<h3>Retrieving Images</h3>
<p>
Retriever can also optionally download images and store them in the
local Friendica instance. Just check the "Download Images" box. You
can also download images in every item from your network, whether it's
an RSS feed or not. Go to the "Settings" page and
click <a href="{{$config}}">"Plugin settings"</a>. Then check the "All
Photos" box in the "Retriever Settings" section and click "Submit".
</p>
{{/if}}
<h2>Configure Feeds:</h2>
<div>
{{foreach $feeds as $feed}}
<div class="contact-entry-wrapper" id="contact-entry-wrapper-{{$feed.id}}">
<a href="{{$feed.url}}" title="{{$feed.img_hover}}">
<div class="contact-entry-photo-wrapper">
<div class="contact-entry-photo mframe" id="contact-entry-photo-{{$feed.id}}">
<img src="{{$feed.thumb}}" {{$feed.sparkle}} alt="{{$feed.name}}"/>
</div>
</div>
<div class="contact-entry-desc">
<div class="contact-entry-name" id="contact-entry-name-{{$feed.id}}">
{{$feed.name}}
</div>
</div>
</a>
</div>
{{/foreach}}
</div>
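
The help text above describes two mechanisms that are easy to misread: how an Include or Exclude row matches an element (including the "*" tag, blank attributes, and the special handling of "class"), and how "URL Pattern" and "URL Replace" rewrite the article URL. The following JavaScript sketch models both as described in the help text. The names `ruleMatches` and `rewriteUrl` and the `{tag, attributes}` element shape are hypothetical, chosen for illustration; the plugin's own implementation may behave differently in edge cases.

```javascript
// Sketch only: models the rule semantics from the help text, not the plugin's code.
function ruleMatches(element, rule) {
  // A tag of "*" matches any element type.
  if (rule.element !== '*' && rule.element !== element.tag) {
    return false;
  }
  // A blank attribute matches every element with the given tag name.
  if (!rule.attribute) {
    return true;
  }
  const actual = element.attributes[rule.attribute];
  if (actual === undefined) {
    return false;
  }
  // "class" is special: any one of the space-separated classes may match.
  if (rule.attribute === 'class' && actual.split(/\s+/).includes(rule.value)) {
    return true;
  }
  // Otherwise the whole attribute value must be equal, which is why
  // "article breaking-news" stops matching if the site reorders its classes.
  return actual === rule.value;
}

// Assumption: the pattern is a regular expression and only its first match
// is replaced. A "$" in the replace string is special in JavaScript.
function rewriteUrl(url, pattern, replace) {
  return url.replace(new RegExp(pattern), replace);
}

const div = { tag: 'div', attributes: { class: 'article breaking-news' } };
console.log(ruleMatches(div, { element: 'div', attribute: 'class', value: 'article' }));
console.log(rewriteUrl('http://www.newssite.com/article-8636.html', '/article', '/print/article'));
```

Matching on a single class name is the more robust choice, since it survives the site adding or reordering classes on the same element.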


@ -0,0 +1,154 @@
<div class="settings-block">
<script type="text/javascript">
function retriever_add_row(id)
{
var tbody = document.getElementById(id);
var last = tbody.rows[tbody.rows.length - 1];
// If every row has been removed, restart numbering at 0 instead of failing on last.id.
var count = last ? +last.id.replace(id + '-', '') + 1 : 0;
var row = document.createElement('tr');
row.id = id + '-' + count;
var cell1 = document.createElement('td');
var inptag = document.createElement('input');
inptag.name = row.id + '-element';
cell1.appendChild(inptag);
row.appendChild(cell1);
var cell2 = document.createElement('td');
var inpatt = document.createElement('input');
inpatt.name = row.id + '-attribute';
cell2.appendChild(inpatt);
row.appendChild(cell2);
var cell3 = document.createElement('td');
var inpval = document.createElement('input');
inpval.name = row.id + '-value';
cell3.appendChild(inpval);
row.appendChild(cell3);
var cell4 = document.createElement('td');
var butrem = document.createElement('input');
butrem.id = row.id + '-rem';
butrem.type = 'button';
butrem.onclick = function(){retriever_remove_row(id, count)};
butrem.value = '{{$remove_t}}';
cell4.appendChild(butrem);
row.appendChild(cell4);
tbody.appendChild(row);
}
function retriever_remove_row(id, number)
{
var tbody = document.getElementById(id);
var row = document.getElementById(id + '-' + number);
tbody.removeChild(row);
}
function retriever_toggle_url_block()
{
// Show the pattern and replace fields only while "modurl" is checked.
var display = document.querySelector("#id_retriever_modurl").checked ? "block" : "none";
document.querySelector("#id_retriever_pattern").parentNode.style.display = display;
document.querySelector("#id_retriever_replace").parentNode.style.display = display;
}
function retriever_toggle_cookiedata_block()
{
var div = document.querySelector("#id_retriever_cookiedata").parentNode;
if (document.querySelector("#id_retriever_storecookies").checked) {
div.style.display = "block";
}
else {
div.style.display = "none";
}
}
document.addEventListener('DOMContentLoaded', function() {
retriever_toggle_url_block();
document.querySelector("#id_retriever_modurl").addEventListener('change', retriever_toggle_url_block, false);
retriever_toggle_cookiedata_block();
document.querySelector("#id_retriever_storecookies").addEventListener('change', retriever_toggle_cookiedata_block, false);
}, false);
</script>
<h2>{{$title}}</h2>
<p><a href="{{$help}}">{{$help_t}}</a></p>
<form method="post">
<input type="hidden" name="id" value="{{$id}}">
{{include file="field_checkbox.tpl" field=$enable}}
<h3>{{$include_t}}:</h3>
<div>
<table>
<thead>
<tr><th>{{$tag_t}}</th><th>{{$attribute_t}}</th><th>{{$value_t}}</th></tr>
</thead>
<tbody id="retriever-include">
{{if $include}}
{{foreach $include as $k=>$m}}
<tr id="retriever-include-{{$k}}">
<td><input name="retriever-include-{{$k}}-element" value="{{$m.element}}"></td>
<td><input name="retriever-include-{{$k}}-attribute" value="{{$m.attribute}}"></td>
<td><input name="retriever-include-{{$k}}-value" value="{{$m.value}}"></td>
<td><input id="retrieve-include-{{$k}}-rem" type="button" onclick="retriever_remove_row('retriever-include', {{$k}})" value="{{$remove_t}}"></td>
</tr>
{{/foreach}}
{{else}}
<tr id="retriever-include-0">
<td><input name="retriever-include-0-element"></td>
<td><input name="retriever-include-0-attribute"></td>
<td><input name="retriever-include-0-value"></td>
<td><input id="retrieve-include-0-rem" type="button" onclick="retriever_remove_row('retriever-include', 0)" value="{{$remove_t}}"></td>
</tr>
{{/if}}
</tbody>
</table>
<input type="button" onclick="retriever_add_row('retriever-include')" value="{{$add_t}}">
</div>
<h3>{{$exclude_t}}:</h3>
<div>
<table>
<thead>
<tr><th>{{$tag_t}}</th><th>{{$attribute_t}}</th><th>{{$value_t}}</th></tr>
</thead>
<tbody id="retriever-exclude">
{{if $exclude}}
{{foreach $exclude as $k=>$r}}
<tr id="retriever-exclude-{{$k}}">
<td><input name="retriever-exclude-{{$k}}-element" value="{{$r.element}}"></td>
<td><input name="retriever-exclude-{{$k}}-attribute" value="{{$r.attribute}}"></td>
<td><input name="retriever-exclude-{{$k}}-value" value="{{$r.value}}"></td>
<td><input id="retrieve-exclude-{{$k}}-rem" type="button" onclick="retriever_remove_row('retriever-exclude', {{$k}})" value="{{$remove_t}}"></td>
</tr>
{{/foreach}}
{{else}}
<tr id="retriever-exclude-0">
<td><input name="retriever-exclude-0-element"></td>
<td><input name="retriever-exclude-0-attribute"></td>
<td><input name="retriever-exclude-0-value"></td>
<td><input id="retrieve-exclude-0-rem" type="button" onclick="retriever_remove_row('retriever-exclude', 0)" value="{{$remove_t}}"></td>
</tr>
{{/if}}
</tbody>
</table>
<input type="button" onclick="retriever_add_row('retriever-exclude')" value="{{$add_t}}">
</div>
{{include file="field_checkbox.tpl" field=$modurl}}
{{include file="field_input.tpl" field=$pattern}}
{{include file="field_input.tpl" field=$replace}}
{{if $allow_images}}
{{include file="field_checkbox.tpl" field=$images}}
{{/if}}
{{include file="field_textarea.tpl" field=$customxslt}}
{{include file="field_checkbox.tpl" field=$storecookies}}
{{include file="field_textarea.tpl" field=$cookiedata}}
{{include file="field_input.tpl" field=$retrospective}}
<input type="submit" size="70" value="{{$submit_t}}">
</form>
</div>


@ -0,0 +1,5 @@
<p><a href="{{$help}}">Get Help</a></p>
{{if $allow_images}}
{{include file="field_checkbox.tpl" field=$allphotos}}
{{/if}}
{{include file="field_checkbox.tpl" field=$oembed}}

File diff suppressed because it is too large