Fetch part of an HTML page in Java
I'm having trouble understanding how to download only part of an HTML page. I tried the traditional approach, URL::openStream with a BufferedReader, but I'm not sure whether this forces me to download the whole page.
The problem is: I have a fairly large HTML page and I need to parse two numbers from it, which update at least once a second. The approach above detects changes only once every 2-3 seconds, and I wonder whether there is a way to make it faster. So I thought fetching only part of the page might help.
java html inputstreamreader
asked Nov 20 at 10:20
Vlad Doronin
Perhaps you can try Jsoup?
– manfromnowhere
Nov 20 at 10:31
It builds a DOM from the whole page. It's quite fast, but not fast enough.
– Vlad Doronin
Nov 20 at 10:41
2 Answers
accepted
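I think you should check how the page fetches its data (SSE or WebSocket) and try to subscribe to that service directly. If that is impossible, try a more efficient XML parser. I recommend https://vtd-xml.sourceforge.io/; it can be ~10x faster than the DOM parser that comes with the JDK.
Also be careful with BufferedReader.readLine(), as there is a hidden allocation cost (this is fairly advanced, as you have to think about CPU memory bandwidth, L1 cache misses, etc.) for strings that you don't really need.
Example using the library I mentioned:

    import com.ximpleware.*;
    import java.nio.charset.StandardCharsets;

    // readAllBytesFromTheURL() is a placeholder for your own download code
    byte[] pageInBytes = readAllBytesFromTheURL();
    VTDGen vg = new VTDGen();
    vg.setDoc(pageInBytes);
    vg.parse(false); // false = not namespace-aware; throws a checked exception on malformed input
    VTDNav vn = vg.getNav();
    AutoPilot ap = new AutoPilot(vn);
    // Jump to the section that we want to process
    ap.selectXPath("/html/body/div");
    if (ap.evalXPath() != -1) {
        // The long packs the byte offset (lower 32 bits) and the length (upper 32 bits)
        long fragment = vn.getElementFragment();
        // Assumes the page is UTF-8 encoded
        String fileId = new String(pageInBytes, (int) fragment, (int) (fragment >> 32), StandardCharsets.UTF_8);
    }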
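To illustrate the readLine() point: below is a minimal sketch (not the asker's code) that scans the stream in fixed-size chunks for a marker and returns as soon as the value after it is complete. No per-line String is allocated, and the caller never reads the rest of the page. The class name MarkerScanner, the method valueAfter, and the `id="bid">` markup are made up for the example. The matching uses a standard KMP failure table so that markers with repeated prefixes are still found.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class MarkerScanner {

    // KMP failure table: fail[i] is the length of the longest proper prefix
    // of marker[0..i] that is also a suffix of it.
    private static int[] failureTable(String marker) {
        int[] fail = new int[marker.length()];
        int k = 0;
        for (int i = 1; i < marker.length(); i++) {
            while (k > 0 && marker.charAt(i) != marker.charAt(k)) {
                k = fail[k - 1];
            }
            if (marker.charAt(i) == marker.charAt(k)) {
                k++;
            }
            fail[i] = k;
        }
        return fail;
    }

    /**
     * Reads until `marker` has been seen, then collects the characters that
     * follow, up to (not including) `terminator`. Returns null if the marker
     * or the terminator never appears before the end of the stream.
     */
    static String valueAfter(Reader in, String marker, char terminator) {
        if (marker.isEmpty()) {
            throw new IllegalArgumentException("marker must not be empty");
        }
        int[] fail = failureTable(marker);
        char[] chunk = new char[4096]; // reused for every read, no per-line Strings
        int matched = 0;               // number of marker characters matched so far
        StringBuilder value = null;    // non-null once the whole marker has been seen
        try {
            int n;
            while ((n = in.read(chunk)) != -1) {
                for (int i = 0; i < n; i++) {
                    char c = chunk[i];
                    if (value != null) {
                        if (c == terminator) {
                            return value.toString(); // stop early, rest is never read
                        }
                        value.append(c);
                        continue;
                    }
                    while (matched > 0 && c != marker.charAt(matched)) {
                        matched = fail[matched - 1];
                    }
                    if (c == marker.charAt(matched)) {
                        matched++;
                    }
                    if (matched == marker.length()) {
                        value = new StringBuilder();
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical page content standing in for the real stream
        String html = "<div>noise</div><span id=\"bid\">101.25</span><span id=\"ask\">101.30</span>";
        System.out.println(valueAfter(new StringReader(html), "id=\"bid\">", '<'));
        System.out.println(valueAfter(new StringReader(html), "id=\"ask\">", '<'));
    }
}
```

With a real page, the reader would be something like `new InputStreamReader(url.openStream(), StandardCharsets.UTF_8)` in a try-with-resources block. Closing it early only stops your side from reading; the server may already have sent more, so this mainly saves parsing and allocation rather than network time.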
edited Nov 20 at 11:22
answered Nov 20 at 11:14
piotr szybicki
Thanks a lot! By the way, the page uses Lightstreamer to fetch data from their servers. I tried to use it directly, which unsurprisingly was not successful.
– Vlad Doronin
Nov 20 at 11:35
Cool, can you accept my answer? I'm trolling for points on Stack Overflow :)
– piotr szybicki
Nov 20 at 12:50
Yeah, sure. But VTD didn't work for me. The page has some tokens that VTD cannot parse, so now I'm writing a custom reader. But I tried it on another XML file and it is really fast.
– Vlad Doronin
Nov 20 at 12:57
Can you share your solution when you are done? I'm curious to see what you come up with.
– piotr szybicki
Nov 20 at 14:18
Posted my code in the next answer.
– Vlad Doronin
Nov 21 at 12:06
I wrote a helper to read the URL content. The parser for the elements is in another class.

    import java.io.BufferedInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.net.URL;
    import java.util.Iterator;

    public class HTMLReaderHelper {
        private final URL currentURL;

        HTMLReaderHelper(URL url) {
            currentURL = url;
        }

        // Returns null if the stream cannot be opened
        public CharIterator charIterator() {
            try {
                return new CharIterator();
            } catch (IOException ex) {
                return null;
            }
        }

        public StringIterator stringIterator() {
            return new StringIterator();
        }

        // Iterates over the page one character at a time
        class CharIterator implements Iterator<Character> {
            private static final int NONE = -2; // nothing has been read ahead
            private final InputStream urlStream;
            private boolean isValid = true;
            private int peeked = NONE; // one-character lookahead filled by hasNext()

            private CharIterator() throws IOException {
                // Buffered so that read() does not hit the socket for every byte
                urlStream = new BufferedInputStream(currentURL.openStream());
            }

            @Override
            public boolean hasNext() {
                if (peeked == NONE) {
                    peeked = read();
                }
                return peeked != -1;
            }

            // Returns null at the end of the stream or after an I/O error
            @Override
            public Character next() {
                int c = (peeked != NONE) ? peeked : read();
                peeked = NONE;
                return (c == -1) ? null : Character.valueOf((char) c);
            }

            // Reads one byte as a character (assumes a single-byte encoding such as ASCII);
            // returns -1 at the end of the stream or on error
            private int read() {
                if (!isValid) {
                    return -1;
                }
                try {
                    int c = urlStream.read();
                    if (c == -1) {
                        markInvalid();
                    }
                    return c;
                } catch (IOException ex) {
                    markInvalid();
                    return -1;
                }
            }

            private void markInvalid() {
                isValid = false;
            }
        }

        // Iterates over the non-empty lines of the page
        class StringIterator implements Iterator<String> {
            private final CharIterator charPointer = charIterator();
            private String peeked; // line read ahead by hasNext()

            private StringIterator() {
            }

            @Override
            public boolean hasNext() {
                if (peeked == null) {
                    peeked = readLine();
                }
                return peeked != null;
            }

            // Returns null when there are no more lines
            @Override
            public String next() {
                String line = (peeked != null) ? peeked : readLine();
                peeked = null;
                return line;
            }

            private String readLine() {
                if (charPointer == null) {
                    return null; // the stream could not be opened
                }
                Character currentChar = charPointer.next();
                // Skip line terminators left over from the previous line
                while (currentChar != null && (currentChar == '\n' || currentChar == '\r')) {
                    currentChar = charPointer.next();
                }
                if (currentChar == null) {
                    return null;
                }
                StringBuilder sb = new StringBuilder();
                while (currentChar != null && currentChar != '\n' && currentChar != '\r') {
                    sb.append(currentChar.charValue());
                    currentChar = charPointer.next();
                }
                return sb.toString();
            }
        }
    }
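The parser class itself is not shown above, so the following is only a guess at what the consuming side could look like: a minimal sketch that pulls the first two numbers out of an `Iterator<String>` of lines and stops reading as soon as both are found. The class name TwoNumberParser, the method firstTwo, and the `class="quote"` markup are assumptions for illustration, not taken from the real page.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TwoNumberParser {

    // Hypothetical markup: <span class="quote">123.45</span>
    private static final Pattern QUOTE =
            Pattern.compile("class=\"quote\">(-?\\d+(?:\\.\\d+)?)<");

    /**
     * Consumes lines only until two numbers have been found, so the rest of
     * the page is never pulled from the underlying iterator.
     */
    static List<Double> firstTwo(Iterator<String> lines) {
        List<Double> found = new ArrayList<>(2);
        while (found.size() < 2 && lines.hasNext()) {
            Matcher m = QUOTE.matcher(lines.next());
            while (found.size() < 2 && m.find()) {
                found.add(Double.parseDouble(m.group(1)));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Stand-in for helper.stringIterator() on the real page
        Iterator<String> lines = List.of(
                "<html><body>",
                "<span class=\"quote\">101.25</span>",
                "<span class=\"quote\">101.30</span>",
                "<div>a very long footer that is never parsed</div>"
        ).iterator();
        System.out.println(firstTwo(lines));
    }
}
```

Because the loop checks `found.size()` before calling `hasNext()`, the line iterator is not advanced past the line that contains the second number.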
answered Nov 21 at 12:05
Vlad Doronin