Issue
I want to block requests made with `file_get_contents`, automated `curl` requests, and other automated scraping of my project.
I don't want to do this with a WAF or an external system.
How can I filter incoming requests using PHP only?
Solution
You can implement your own robot check with reCAPTCHA: show the challenge to visitors without a verified session, and serve content only to visitors who pass it.
For example:
<?php
session_start();

if ($_SERVER['REQUEST_METHOD'] === 'POST' && isset($_POST['g-recaptcha-response'])) {
    // The reCAPTCHA widget submits the token in the 'g-recaptcha-response' field.
    // Verify it against Google's siteverify endpoint with a real POST request:
    $recaptcha_url = 'https://www.google.com/recaptcha/api/siteverify';
    $recaptcha_secret = 'YOUR_RECAPTCHA_SECRET_KEY';
    $context = stream_context_create([
        'http' => [
            'method'  => 'POST',
            'header'  => 'Content-Type: application/x-www-form-urlencoded',
            'content' => http_build_query([
                'secret'   => $recaptcha_secret,
                'response' => $_POST['g-recaptcha-response'],
            ]),
        ],
    ]);
    $recaptcha = json_decode(file_get_contents($recaptcha_url, false, $context));

    // Mark the session as human only if verification succeeded with a high score:
    $_SESSION['human'] = !empty($recaptcha->success) && $recaptcha->score >= 0.8;
}
if (empty($_SESSION['human'])) {
    // No verified session yet: show the challenge page and stop.
    echo <<<HTML
<html>
<head>
    <title>Human Test</title>
    <script src="https://www.google.com/recaptcha/api.js"></script>
    <script>
        function onSubmit(token) {
            document.getElementById("demo-form").submit();
        }
    </script>
</head>
<body>
    <form method="POST" id="demo-form">
        <button class="g-recaptcha"
                data-sitekey="YOUR_RECAPTCHA_SITE_KEY"
                data-callback="onSubmit"
                data-action="submit">I am a Human :-)</button>
    </form>
</body>
</html>
HTML;
    exit;
}

// Whatever content you want to serve to verified visitors
// ...
This blocks most simple scrapers: plain `curl` or `file_get_contents` requests never get past the challenge page, though a determined scraper driving a real browser could still pass.
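As a cheap first-pass filter you can also inspect request headers before the reCAPTCHA check even runs. The sketch below is an illustrative assumption, not part of the original answer: the helper name `looks_like_scraper` and the client list are hypothetical, and headers are trivially spoofed, so treat this only as a complement to the session check above.

```php
<?php
// Hypothetical pre-filter: reject requests whose User-Agent matches the
// default identifier of a common scraping client. Easily spoofed, so this
// is only a first line of defence, not a replacement for the CAPTCHA.
function looks_like_scraper(?string $userAgent): bool
{
    // PHP's file_get_contents() sends no User-Agent header at all by default.
    if ($userAgent === null || $userAgent === '') {
        return true;
    }
    // Default identifiers of common HTTP clients:
    return (bool) preg_match(
        '/\b(curl|wget|python-requests|scrapy|go-http-client)\b/i',
        $userAgent
    );
}

// Usage in a front controller (skipped when run from the command line):
if (PHP_SAPI !== 'cli'
    && looks_like_scraper($_SERVER['HTTP_USER_AGENT'] ?? null)) {
    http_response_code(403);
    exit('Forbidden');
}
```

A bare `file_get_contents('https://example.com/')` call or a default `curl` invocation fails this check immediately, which saves a reCAPTCHA round trip for the most naive scrapers.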
Answered By - Koala Yeung Answer Checked By - Cary Denson (WPSolving Admin)